Abstract



We investigate the generation of quantum states and unitary operations that are "ran- 
dom" in certain respects. We show how to use such states to estimate the average fidelity, 
an important measure in the study of implementations of quantum algorithms. We re- 
discover the result that the states of a maximal set of mutually-unbiased bases serve this 
purpose. An efficient circuit is presented that generates an arbitrary state out of such a 
set. 

Later on, we consider unitary operations that can be used to turn any quantum channel 
into a depolarizing channel. It was known before that the Clifford group serves this and a 
related purpose, and we show that these are actually the same. We also show that a small 
subset of the Clifford group is already sufficient to accomplish this. We conclude with an
efficient construction of the elements of that subset.

Chapter 1
Introduction 



1.1 Preface 

Quantum Computing is a multi-disciplinary subject that tries to make use of the laws of
quantum mechanics that govern our physical reality. Since Richard Feynman illustrated
how to simulate quantum mechanical systems in the 1980s [Fey82], quantum computing
has gained a lot of attention. This is particularly due to Peter Shor's factoring algorithm
[Sho96], which, provided that quantum computers can be built efficiently, would break most of
the public-key cryptosystems in use these days. Besides this drawback, quantum computers
would enable us to efficiently simulate molecular dynamics and thus would help to develop
new materials, and would dramatically improve our understanding of molecular biology,
for example.

These widespread applications of quantum computers have led to an enormous effort being
put into their physical realization. However, it has not been possible to control
more than a dozen qubits — far from the applications outlined above, which will need many 
dozens, hundreds, or even thousands of qubits. Out of the many obstacles, noise is the 
most prominent one that hinders the development of large-scale quantum computers. 

In this thesis, we devise a protocol to estimate the average fidelity, a global property 
of the strength of the noise associated with a quantum channel. We will see that it is
sufficient to use so-called mutually-unbiased bases (MUBs), as they will lead to the same
average as the uniform measure over all quantum states. We re-discovered the previously 
known result that the states in a complete set of MUBs are a 2-design for quantum states. 
The contribution in this area is an explicit construction of circuits that generate the MUB 
states. 

From a different point of view, this thesis is concerned with the generation of quantum
states and unitary operations that are "random" in certain respects. A truly random
quantum state on $n$ qubits can be regarded as a uniformly distributed $2^n$-dimensional
unit complex vector (where two vectors are regarded as equivalent if one is a multiple of
the other). This uniform distribution is the Fubini-Study measure (Definition A.7.1), and
is defined by the property that it is invariant under unitary transformations. Since the
space of possible states has $2^{n+1} - 2$ real degrees of freedom, it is infeasible to generate a
distribution of states that is statistically close to a good approximation of this distribution
with a polynomial number of operations (Section 2.3.5). On the other hand, there are
very efficient methods for simulating random states that are equivalent to this in certain
restricted contexts.

The following sections present an introduction to the fundamental concepts of quantum 
mechanics and quantum computing. As we will make use of concepts from Linear Algebra,
the Dirac Notation, Group Theory, Functional Analysis, Topology, Harmonic Analysis,
Finite Fields, and Finite Rings, we present some background on those areas in Appendix
A and refer to the appropriate literature in the respective area.

Chapter 2 introduces measures to characterize noise, and the average fidelity in particular.
It also shows current approaches to estimate the average fidelity. The Decomposition
Lemma 2.3.13 will be used in the subsequent chapters. After that, the concept of
mutually-unbiased bases is introduced formally and the known constructions are presented in
Chapter 3. The chapter concludes with interesting open problems in that area. The main
contributions of this thesis are presented in Chapters 4 and 5, where we present alternate proofs
for 2-designs for quantum states and quantum operations, respectively. The contributions in
both chapters are the different proof techniques and circuit constructions for the 2-designs.
Chapter 4 gives the explicit construction for MUB states, which are already known to be
a 2-design. Chapter 5, in contrast, shows that a subset of the Clifford group is already
a unitary 2-design. An efficient construction of the elements of that subset is presented.
Finally, we summarize our conclusions and outline interesting future areas of research in
Chapter 6.

1.2 Quantum Mechanics Framework 

At any point in time, the state of a classical system is well-defined, say the position of a 
car on a street at a given time t. In quantum mechanics, however, a system is not just in 
a single state, but in a superposition of potentially more than one state. Formally, the
state of a quantum mechanical system is given by

\[ |\psi\rangle = \sum_{j=1}^{n} \alpha_j |\psi_j\rangle, \tag{1.1} \]

where $\alpha_j$ denotes the "amplitude" with which the system is in its basis state $|\psi_j\rangle$. For
example, these could be the position or polarization angle of a photon, the energy of an
electron, or the spin of an atom. Although the system is in this superposition of several
states, an observation will force the system into a single state $|\psi_j\rangle$ with probability $|\alpha_j|^2$.
The amplitudes must satisfy

\[ \sum_{j=1}^{n} |\alpha_j|^2 = 1 \]

to give a probability distribution over the states $|\psi_j\rangle$ upon an observation of the system.
More formally, the state space of a quantum mechanical system is a complex inner product
space $\mathcal{H}$, commonly referred to as a "Hilbert space". A valid state of a quantum mechanical
system is described by a unit vector in $\mathcal{H}$.

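The elements of the state space $\mathcal{H}$ are so-called "ket" vectors $|\psi\rangle$. Denote by $\langle\psi|$ the
dual of $|\psi\rangle$, which is a linear functional on $\mathcal{H}$ such that

\[ \langle\psi|\,(|\phi\rangle) \]

is the inner product between $|\psi\rangle$ and $|\phi\rangle$. This notation is further shortened by letting

\[ \langle\psi|\,(|\phi\rangle) = \langle\psi|\phi\rangle. \]

Now, we can refine (1.1) by specifying that the $|\psi_i\rangle$ have to be pairwise orthogonal, i.e.
$\langle\psi_i|\psi_j\rangle = 0$ whenever $i \neq j$, such that $\{|\psi_i\rangle \mid i = 1, 2, \ldots, n\}$ forms a basis for $\mathcal{H}$. For the
purpose of this work, we will not worry about infinite dimensional state spaces and will
assume any state space $\mathcal{H}$ to be of finite dimension.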

The time evolution of a quantum mechanical system is either unitary if the system is 
isolated, or a measurement if the system is observed. 



Unitary Evolution The state of an isolated quantum system evolves according to a
linear function $U$ that acts on the state as a vector. Therefore, we can think of $U$ as a
matrix that maps

\[ \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n \end{pmatrix} \mapsto \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_n \end{pmatrix}. \]

The resulting state $|\psi'\rangle = \beta_1|\psi_1\rangle + \cdots + \beta_n|\psi_n\rangle$ must satisfy the normalization constraint
as well, which requires $U$ to be unitary and leads to the name of this evolution. It also
implies that the evolution is reversible with $U^{-1} = U^\dagger$ (where $U^\dagger$ denotes the complex
conjugate transpose of $U$).
Measurement While an isolated quantum system evolves unitarily, different laws hold
when the system is inspected by an observer. This process is called a measurement of the
quantum system. In general, a measurement is given by a set of measurement operators
$\{M_m\}$ on the state space of the system. The probability that the measurement of a state
$|\psi\rangle$ yields outcome $m$ is given by

\[ p(m) = \langle\psi| M_m^\dagger M_m |\psi\rangle. \]

After the measurement, the state "collapses" to

\[ \frac{M_m|\psi\rangle}{\sqrt{p(m)}}. \]

The measurement operators satisfy the completeness relation

\[ \sum_m M_m^\dagger M_m = I \]

so that the outcome probabilities form a probability distribution:

\[ \sum_m p(m) = \sum_m \langle\psi| M_m^\dagger M_m |\psi\rangle = \langle\psi| \Big( \sum_m M_m^\dagger M_m \Big) |\psi\rangle = 1. \]

There is a different view on the measurement process called "Positive Operator-Valued
Measure", abbreviated as POVM, that is most often employed by physicists. It reduces
the general measurement operators $M_m$ to a set of positive operators

\[ E_m = M_m^\dagger M_m \]

such that the probability of observing outcome $m$ is

\[ p(m) = \langle\psi| E_m |\psi\rangle. \]

The completeness relation now reads

\[ \sum_m E_m = I. \]

The complete set of operators $\{E_m\}$ is usually referred to as a POVM with POVM elements
$E_m$. A POVM is especially well suited for the analysis of the measurement statistics when
the post-measurement state is of no interest. It is simpler than the general measurement
description yet still powerful enough to describe the complete statistics of any quantum
measurement.

A special kind of measurement is a projective measurement or von Neumann measurement.
A projective measurement is given by an observable $M$, which is required to be a
Hermitian operator so that its spectral decomposition

\[ M = \sum_m m P_m \]

exists, where $P_m$ is a projector onto the eigenspace of $M$ with eigenvalue $m$. The eigenvalues
$m$ of $M$ represent the possible outcomes of the experiment. The probability of measuring
$m$ is given by

\[ p(m) = \langle\psi| P_m |\psi\rangle \]

and the state after a measurement with outcome $m$ is

\[ \frac{P_m|\psi\rangle}{\sqrt{p(m)}}. \]

A projective measurement can be described as a general measurement with measurement
operators $M_m = P_m$, which leads to simplified calculations as $P_m^\dagger = P_m$, $P_m^2 = P_m$ and
thus $P_m^\dagger P_m = P_m$.

The easiest example of a projective or von Neumann measurement is a measurement
in the standard basis. Say we are given the state (1.1) and measure with respect to the
basis $\{|\psi_1\rangle, \ldots, |\psi_n\rangle\}$. The measurement is given by the projectors $P_m = |\psi_m\rangle\langle\psi_m|$ that
project onto the subspace spanned by $|\psi_m\rangle$. As the $|\psi_m\rangle$ are pairwise orthogonal, it follows
that the projectors $P_m$ are pairwise orthogonal, too. Hence the observable $\sum_m m\,|\psi_m\rangle\langle\psi_m|$ defines a quantum
measurement that yields $m$ with probability $|\langle\psi_m|\psi\rangle|^2 = |\alpha_m|^2$, leaving the system in the
post-measurement state $\frac{\alpha_m}{|\alpha_m|}|\psi_m\rangle$, which is equivalent to $|\psi_m\rangle$.
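
The standard-basis measurement described above can be sketched numerically. The following is a minimal illustration in plain Python, not part of the original text; the function names `outcome_probabilities` and `post_measurement_state` and the example state are assumptions made for this sketch, and outcomes are indexed from 0 rather than 1.

```python
import math

def outcome_probabilities(amplitudes):
    """Return p(m) = |alpha_m|^2 for a state given by its amplitudes."""
    return [abs(a) ** 2 for a in amplitudes]

def post_measurement_state(amplitudes, m):
    """Apply P_m = |psi_m><psi_m| and renormalize by sqrt(p(m))."""
    p_m = abs(amplitudes[m]) ** 2
    if p_m == 0:
        raise ValueError("outcome m has probability zero")
    return [a / math.sqrt(p_m) if i == m else 0j
            for i, a in enumerate(amplitudes)]

# Example state (assumed for illustration): (|psi_0> + i|psi_1>) / sqrt(2)
state = [1 / math.sqrt(2), 1j / math.sqrt(2)]
probs = outcome_probabilities(state)
# Collapsed state for outcome 1: (alpha_1 / |alpha_1|) |psi_1> = i |psi_1>,
# which equals |psi_1> up to a global phase.
collapsed = post_measurement_state(state, 1)
```

The remaining phase factor in `collapsed` is the global phase $\alpha_m / |\alpha_m|$ mentioned in the text; it has no observable effect.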

1.3 Quantum Computing

The basic unit of information in classical computation is the bit, which can either take
the value 0 or 1. In quantum computation, the basic unit is a qubit, which can be in a
superposition of 0 and 1. We usually identify the basis states of a qubit with $|0\rangle$ and $|1\rangle$.
Using the notation of quantum mechanics, we can say more precisely that the state of a
single qubit is the superposition

\[ \alpha_0|0\rangle + \alpha_1|1\rangle \]

for $\alpha_0, \alpha_1 \in \mathbb{C}$ such that $|\alpha_0|^2 + |\alpha_1|^2 = 1$. Hence the state space of a single qubit is a
two-dimensional Hilbert space $\mathcal{H}_2$.

The state space of a system with $n$ qubits is described by the $n$-fold tensor product of
a single-qubit system

\[ \mathcal{H}_{2^n} = \underbrace{\mathcal{H}_2 \otimes \cdots \otimes \mathcal{H}_2}_{n \text{ times}}. \]

For example, a system consisting of two qubits has the four basis states $|0\rangle|0\rangle$, $|0\rangle|1\rangle$, $|1\rangle|0\rangle$,
and $|1\rangle|1\rangle$, where the state $|0\rangle|1\rangle$ means that the first qubit is in state $|0\rangle$ and the second
qubit is in state $|1\rangle$. In general, an $n$-qubit system has basis states which correspond to all
binary strings of length $n$. Instead of writing

\[ |b_0\rangle|b_1\rangle \cdots |b_{n-1}\rangle \]

we will write $|b_0 b_1 \ldots b_{n-1}\rangle$ or sometimes even shorter using a base-10 representation of the
binary number $(b_0 b_1 \cdots b_{n-1})_2 = \sum_{i=0}^{n-1} 2^i b_i$. Therefore, we can write the basis states of a
register of $n$ qubits as $|0\rangle, |1\rangle, \ldots, |2^n - 1\rangle$. The general state of such a register is given by

\[ \alpha_0|0\rangle + \alpha_1|1\rangle + \cdots + \alpha_{2^n-1}|2^n - 1\rangle \]

where

\[ \sum_{i=0}^{2^n-1} |\alpha_i|^2 = 1. \]



Hence the state is described by $2^n$ complex amplitudes. Taking into account the
normalization condition, we still seem to have $2^n - 1$ complex degrees of freedom. Therefore it
seems that a system of $n$ qubits contains a huge amount of information that is encoded
in its complex amplitudes, as opposed to $n$ bits of information in a classical $n$-bit system.
However, this is only true in a restricted sense: for the generally accepted definition of
Holevo information [NC00], it is known that a qubit contains not more than one classical
bit of information. For a deeper elaboration of quantum information theory, we refer the
reader to [NC00, Ch. 12]. Despite those negative results, quantum computation and
quantum information do offer provable advantages over classical computation and information.
Using de Wolf's words, it is "the art of quantum computing to use this information for
interesting computational purposes" [JW99].



1.3.1 Turing Machine Model 



Classical computation can be described using a variety of different models. Two very
prominent ones are the Turing machine and the circuit model. We will briefly address
the quantum version of the Turing machine and go into the circuit model of quantum
computation in more detail.

In analogy to classical probabilistic Turing machines, quantum Turing machines (QTM)
were defined by Benioff [Ben82] and Deutsch [Deu85]. We adopt the notation set by Benioff
[Ben98] in a fairly recent survey. See [Meg05] for an account of the history of the QTM.

A QTM consists of a one-dimensional infinite tape with cells labelled by the integers
$\mathbb{Z}$, a head, and a unitary step operator. Associated to each cell is a finite state space which
we will usually define to be two-dimensional and therefore each cell will be one qubit.
The head can be in a superposition of a finite number of orthogonal internal states $|l\rangle$,
$l \in \{1, 2, \ldots, L\}$, and a position $j$ on the tape. In analogy to the classical Turing machine,
we define the elementary actions of a QTM as moving the head one step to the left
or one step to the right, changing the state of the qubit at the position of the head, and
changing the internal part $|l\rangle$ of the head state.

Let $\mathcal{H}$ be the state space of the QTM and write the QTM's computational basis as
$|l, j, s\rangle$, where $|l, j\rangle$ denotes the head's internal state and the position of the head, and
$|s\rangle = \bigotimes_{m=-\infty}^{\infty} |s_m\rangle$ is a basis state of the tape, where $s_m$ is a computational basis state of a single
qubit. In order to avoid technical complications, we will assume $\mathcal{H}$ to have a countable
basis. Therefore, a common requirement is that $s_m \neq 0$ for at most a finite number of $m$.

The computation of the QTM is given by an initial state of the head and tape and
the action of the QTM on each basis state $|l, j, s\rangle$. The action is specified by a step
operator $T$ that obeys certain locality constraints: The head must not move by more than
one position at a time, and the operation on the basis state $|l, j, s\rangle$ may only depend on
and change the state of the $j$-th qubit and the head's internal state.

Although these definitions let a QTM seem analogous to a classical probabilistic Turing 
machine, it has not been found useful in the development of algorithms and in quantum 
complexity theory. Therefore, we will stick to the circuit model as our primary model to 
describe quantum algorithms. 



1.3.2 Circuit Model 

A classical Boolean circuit is a directed acyclic graph with input nodes, internal nodes,
and output nodes. The circuit has $n$ input nodes, $n \geq 0$. The internal nodes are the gates
AND, OR, and NOT, but generally any universal set of gates will work equally well. There
are $m$ designated output nodes, $m \geq 1$. The input bits $x$ are fed into the input nodes,
and after all gates have been applied, the output nodes assume a value $y$. The circuit
computes a Boolean function $f : \{0,1\}^n \to \{0,1\}^m$ if the output nodes assume the value
$f(x)$ for all inputs $x \in \{0,1\}^n$. Figure 1.1 shows a simple classical circuit that computes
$f(a, b) = a \oplus b$.
The circuit model is linked to the Turing machine model by the idea of circuit families.
A circuit family is a set $C = \{C_n\}$ of circuits, one circuit for each input length $n$. A circuit
family decides a language $L \subseteq \{0,1\}^*$ if for any $n$ and any input $x \in \{0,1\}^n$, the circuit
$C_n$ outputs 1 if $x \in L$ and 0 if $x \notin L$. A circuit family $C$ is uniform if $C_n$ can be computed
by a Turing machine given input $n$. A uniformly polynomial circuit family is a uniform
circuit family that can be computed by a Turing machine using space logarithmic in $n$,
which implies a run-time polynomial in $n$. It also implies that the number of gates in $C_n$
is at most polynomial in $n$ as well. The link between Turing machines and circuit families
is given by the following theorem.

Theorem 1.3.1. [Pap94] A language $L \subseteq \{0,1\}^*$ can be computed by a uniformly
polynomial circuit family iff $L \in \mathrm{P}$.

A quantum circuit is a directed acyclic graph with input nodes, internal nodes, and 
output nodes. The inputs are qubits that are prepared in the state |0) or |1). The internal 
nodes are quantum gates, which are unitary transformations that act on a finite number 
of qubits. Restricting these gates to finitely many inputs allows for a comparison of the 
complexities of classical and quantum circuits. Usually, we allow for one and two-qubit 
gates. Table 1.1 shows some elementary quantum gates together with their circuit symbol
and the corresponding unitary matrix. The transformation described by such a quantum
circuit can be computed by taking tensor products of gates applied in parallel on disjoint
sets of qubits and ordinary products of gates applied in series. A quantum circuit can then
be viewed as a single unitary transformation on its n input qubits. 
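
The rule "tensor products for parallel gates, ordinary products for gates in series" can be illustrated with a small sketch. The code below is an assumed example, not taken from the text: it builds the two-qubit circuit "Hadamard on the first qubit, then C-NOT" in plain Python, with the helper names `kron` and `matmul` chosen for this illustration.

```python
import math

def kron(a, b):
    """Tensor (Kronecker) product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]

def matmul(a, b):
    """Ordinary matrix product of a and b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]
I2 = [[1, 0], [0, 1]]
CNOT = [[1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 0]]

# Gates applied in parallel on disjoint qubits: tensor product.
layer1 = kron(H, I2)
# Gates applied in series: matrix product, with later gates on the left.
circuit = matmul(CNOT, layer1)

# Apply the circuit to the input |00> = (1, 0, 0, 0)^T.
ket00 = [[1], [0], [0], [0]]
output = [row[0] for row in matmul(circuit, ket00)]
```

Here `output` holds the amplitudes of $(|00\rangle + |11\rangle)/\sqrt{2}$, and `circuit` is the single $4 \times 4$ unitary that represents the whole circuit.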

Usually, some auxiliary qubits are needed during the computation, which are taken as
needed and assumed to be initialized to $|0\rangle$. We will call them "ancillas" and require them
to be returned to state $|0\rangle$ at the end of the computation. Otherwise, the result of the quantum
algorithm might be corrupted by applying local unitary transformations on the ancillas.
We will call the process of resetting the ancillas "uncomputation", as it is effectively achieved
by reversing that part of the computation that made use of the ancilla.
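Gate       Symbol                      Matrix

Hadamard   H                           $\frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$
Pauli-X    X                           $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$
Pauli-Y    Y                           $\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$
Pauli-Z    Z                           $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$
Phase      S                           $\begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix}$
$\pi/8$    T                           $\begin{pmatrix} 1 & 0 \\ 0 & e^{i\pi/4} \end{pmatrix}$
C-NOT      (control dot, target ⊕)     $\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}$

Table 1.1: Elementary Quantum Gates from [NC00].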
After all gates have been applied to the input qubits and the ancillas, the output nodes
will assume a state $|\varphi\rangle$. It is measured in the computational basis to produce a classical
output string. Note that without loss of generality, all measurements can be placed at
the end of the quantum circuit [NC00], where classically controlled gates are replaced by
quantumly controlled gates. Figure 1.2 shows a simple quantum circuit that computes the
exclusive-OR of its input states, which is the controlled-NOT quantum operation.

Figure 1.2: A quantum circuit that computes $a \oplus b$. The control qubit $|a\rangle$ passes through
unchanged and the target qubit $|b\rangle$ is mapped to $|a \oplus b\rangle$.

A distinct feature of a quantum mechanical unitary evolution is its reversibility. A
unitary operator $U$ is invertible with inverse $U^{-1} = U^\dagger$. This implies that ancillas are
needed in order to compute functions that are not bijections. For example, the
transformation $|x\rangle \mapsto |\mathrm{PARITY}(x)\rangle$ is not unitary as its inverse does not exist. In order to
provide quantum circuits with the ability to calculate those functions as well, we have to
introduce additional qubits. It is known that for any classical Boolean function $f$ from $n$
to $m$ bits, there is a reversible function $U_f$ on $n + m$ input bits that computes

\[ U_f : (x, y) \mapsto (x, y \oplus f(x)), \]

where $x \in \{0,1\}^n$ and $y \in \{0,1\}^m$. Furthermore, if a circuit for $f$ uses $T$ gates, there
is a circuit for $U_f$ that uses $O(T)$ gates and $O(T)$ ancillas. We will understand that $U_f$
computes $f$ reversibly and will use $U_f$ instead of $f$ when we want to implement $f$ with a
quantum circuit.
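
As a small sanity check of the construction $(x, y) \mapsto (x, y \oplus f(x))$, the sketch below builds $U_f$ for the PARITY function on three bits and confirms that it permutes the basis states even though PARITY itself is not a bijection. The helper name `reversible_map` and the choice $n = 3$, $m = 1$ are assumptions made for this illustration only.

```python
def reversible_map(f, n, m):
    """Return U_f as a dict on (x, y) pairs: (x, y) -> (x, y XOR f(x)).

    x ranges over n-bit integers, y over m-bit integers, and f maps
    n-bit integers to m-bit integers.
    """
    return {(x, y): (x, y ^ f(x))
            for x in range(2 ** n) for y in range(2 ** m)}

def parity(x):
    """PARITY of the bits of x (not a bijection for n > 1)."""
    return bin(x).count("1") % 2

U = reversible_map(parity, n=3, m=1)

# U_f is a permutation of the 2^(n+m) basis states ...
is_permutation = sorted(U.values()) == sorted(U.keys())
# ... and it is its own inverse, since y XOR f(x) XOR f(x) = y.
is_involution = all(U[U[xy]] == xy for xy in U)
# Setting y = 0 recovers f(x) in the second register.
recovers_f = all(U[(x, 0)] == (x, parity(x)) for x in range(8))
```

Because $U_f$ is a permutation of basis states, the corresponding matrix is a permutation matrix and hence unitary.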

A universal set $\mathcal{U}$ of quantum gates is a set of single and two-qubit gates such that any
quantum circuit can be built using gates from $\mathcal{U}$. It is known that the controlled-NOT gate
and all single qubit gates form such a universal gate set, albeit one that is continuously
parametrized. To end up with a finite set of single and two-qubit gates, we will loosen
the requirements a bit and allow for approximations of unitary operations and call a finite
set $\mathcal{U}$ universal if we can approximate every gate using only elements from $\mathcal{U}$. Denote the
error made when approximating $U$ by $V$ by

\[ E(U, V) = \max_{|\psi\rangle} \big\| (U - V)|\psi\rangle \big\|, \]

where $\| \cdot \|$ denotes the norm in the state Hilbert space, which is the Euclidean norm if the
state space is finite-dimensional. It is known that the probability distributions obtained
by a POVM on $U|0\rangle$ and $V|0\rangle$ are close in the following sense: Let $M$ be a POVM and
let $p_U(m)$ and $p_V(m)$ be the probabilities of observing $m$ upon measurement of $U|0\rangle$ and $V|0\rangle$,
respectively. Then

\[ |p_U(m) - p_V(m)| \leq 2E(U, V). \]
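
To make the error measure concrete, the following sketch evaluates $E(U, V)$ for two single-qubit gates and compares it with the difference in measurement probabilities. It is an illustration under stated assumptions: the gates are real rotations chosen for this example, the function names are hypothetical, and the closed-form eigenvalue formula used here only applies to $2 \times 2$ matrices.

```python
import math

def rotation(angle):
    """Real single-qubit rotation, used here as an example unitary."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s], [s, c]]

def approximation_error(u, v):
    """E(U, V) = max over unit |psi> of ||(U - V)|psi>|| for 2x2 matrices.

    This equals the largest singular value of D = U - V, computed from the
    closed-form eigenvalues of the 2x2 Hermitian matrix G = D^dagger D.
    """
    d = [[u[i][j] - v[i][j] for j in range(2)] for i in range(2)]
    g00 = abs(d[0][0]) ** 2 + abs(d[1][0]) ** 2
    g11 = abs(d[0][1]) ** 2 + abs(d[1][1]) ** 2
    g01 = (complex(d[0][0]).conjugate() * d[0][1]
           + complex(d[1][0]).conjugate() * d[1][1])
    mean = (g00 + g11) / 2
    radius = math.sqrt(((g00 - g11) / 2) ** 2 + abs(g01) ** 2)
    return math.sqrt(mean + radius)

U = rotation(0.7)
V = rotation(0.7 + 0.05)   # a slightly inaccurate implementation of U
E = approximation_error(U, V)

# Probabilities of outcome 0 when measuring U|0> and V|0> in the
# computational basis; U|0> is the first column of U.
p_U = abs(U[0][0]) ** 2
p_V = abs(V[0][0]) ** 2
bound_holds = abs(p_U - p_V) <= 2 * E
```

For two rotations whose angles differ by $\delta$, the error works out to $2\sin(\delta/2)$, which gives an independent check of `approximation_error`.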
It turns out that the discrete set

\[ \mathcal{U} = \{\mathrm{CNOT}, H, P, \pi/8\} \]

generates all quantum gates with an error $\epsilon > 0$ as small as desired. This is known as
the Solovay-Kitaev Theorem [NC00], which states that we can approximate any quantum
circuit consisting of $m$ CNOT and single-qubit gates within an error of $\epsilon > 0$ using

\[ O\!\left( m \log^c \frac{m}{\epsilon} \right) \]

gates from $\mathcal{U}$ with $c \approx 2$.

To conclude the circuit model, we want to state an obvious extension of this standard
model. So far we assumed that the elementary systems of our quantum computer are
two-level systems which we called qubits. It is possible to use $d$-level systems as the elementary
building blocks of a quantum computer for some finite $d$. We will refer to them as "qudits",
and we can define the circuit model for them in an analogous fashion. The only difference is
the size of the matrix representation of individual gates. A single-qudit gate $U$ corresponds
to a complex $d \times d$ matrix, and two-qudit gates have a matrix of size $d^2 \times d^2$.



1.3.3 Quantum Algorithms 

It seems that the quantum mechanical principle of superposition could be used to speed
up information processing a lot. Imagine a Boolean function

\[ f : \{0,1\}^n \to \{0,1\} \]

on an $n$-bit string. With a quantum computer, we could compute $f(x)$ for all $x \in \{0,1\}^n$ in
parallel, thus leading to an amazing speed-up over any classical computer. More formally,
given a quantum algorithm $U_f$ that computes $f$ reversibly and the input state

\[ |\psi\rangle = \frac{1}{\sqrt{2^n}} \sum_{x=0}^{2^n-1} |x\rangle|0\rangle = H^{\otimes n}|0^{\otimes n}\rangle \otimes |0\rangle, \]

we can compute

\[ U_f|\psi\rangle = \frac{1}{\sqrt{2^n}} \sum_{x=0}^{2^n-1} |x\rangle|f(x)\rangle. \]
However, when measuring the final state, we will end up with a single answer $|x\rangle|f(x)\rangle$
chosen uniformly at random with probability $2^{-n}$. This demonstrates that the speed-up
achieved by quantum computers does not naively stem from the superposition principle. It
rather stems from using interference effects between different states in a superposition to
obtain a global property of a function. This abstract idea is what lies behind the quantum
algorithms that factor large integers or solve the discrete logarithm problem efficiently.
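
The point that a measurement returns only one pair $(x, f(x))$ can be seen in a short classical simulation. This is a sketch under assumptions: the example function `f`, the register size $n = 4$ and the helper names are invented for illustration, and the state is stored as a dictionary of amplitudes rather than produced by an actual circuit.

```python
import math
import random

def evaluate_in_superposition(f, n):
    """Amplitudes of U_f applied to the uniform superposition of |x>|0>.

    Returns a dict mapping basis states (x, f(x)) to their amplitude.
    """
    amp = 1 / math.sqrt(2 ** n)
    return {(x, f(x)): amp for x in range(2 ** n)}

def measure(state, rng):
    """Sample one basis state with probability |amplitude|^2."""
    outcomes = list(state)
    weights = [abs(state[o]) ** 2 for o in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]

def f(x):
    """An arbitrary example function from 4-bit inputs to one bit."""
    return int(x % 3 == 0)

state = evaluate_in_superposition(f, n=4)
# A single measurement reveals only one pair (x, f(x)).
x, fx = measure(state, random.Random(0))
```

All $2^n$ values of $f$ are present in `state`, yet each run of `measure` exposes just one of them, each with probability $2^{-n}$.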

There are two basic ingredients that give rise to the speed-up of quantum algorithms:
the quantum Fourier transform (QFT) and amplitude amplification [NC00]. The QFT is
heavily used in Peter Shor's celebrated algorithms that factor $N = pq$ in time polynomial
in $\log N$ and solve the discrete logarithm problem in any abelian group. We will only
present amplitude amplification as this is the building block needed for the algorithms
presented in this thesis.

1.3.4 Amplitude Amplification 

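Suppose we are given an algorithm $A$ that acts on a Hilbert space $\mathcal{H}$ of $N$ qubits, including
all work qubits. Let

\[ A|0\rangle = |\psi\rangle = \sum_{x \in \{0,1\}^N} \alpha_x |x\rangle. \]

We can think of $A$ as an algorithm that tries to guess the correct output and succeeds with
a reasonably high probability. Let $X_{\mathrm{good}}$ denote the set of desired outputs $x$ and let $X_{\mathrm{bad}}$
be its complement, the set of undesired outputs. Thus we can rewrite

\[ |\psi\rangle = \sum_{x \in X_{\mathrm{good}}} \alpha_x |x\rangle + \sum_{x \in X_{\mathrm{bad}}} \alpha_x |x\rangle. \]

The success probability of $A$ is given by

\[ p_{\mathrm{good}} = \sum_{x \in X_{\mathrm{good}}} |\alpha_x|^2, \]

which is the probability of measuring a state from the set of good states $X_{\mathrm{good}}$. Conversely,
let

\[ p_{\mathrm{bad}} = \sum_{x \in X_{\mathrm{bad}}} |\alpha_x|^2 = 1 - p_{\mathrm{good}} \]

denote the probability of measuring a bad state.

If $p_{\mathrm{good}} = 0$, amplification is useless. If $p_{\mathrm{good}} = 1$, the algorithm is already exact
and amplification is not necessary. Therefore, amplitude amplification may be used if
$0 < p_{\mathrm{good}} < 1$. In that case, the good and bad components can be renormalized to

\[ |\psi_{\mathrm{good}}\rangle = \frac{1}{\sqrt{p_{\mathrm{good}}}} \sum_{x \in X_{\mathrm{good}}} \alpha_x |x\rangle \quad \text{and} \quad |\psi_{\mathrm{bad}}\rangle = \frac{1}{\sqrt{p_{\mathrm{bad}}}} \sum_{x \in X_{\mathrm{bad}}} \alpha_x |x\rangle \]

such that

\[ A|0\rangle = |\psi\rangle = \sin\theta\,|\psi_{\mathrm{good}}\rangle + \cos\theta\,|\psi_{\mathrm{bad}}\rangle, \]

where $\theta = \arcsin\sqrt{p_{\mathrm{good}}}$. Note that the complex phases of the amplitudes are contained in
$|\psi_{\mathrm{good}}\rangle$ and $|\psi_{\mathrm{bad}}\rangle$, therefore we can indeed use real coefficients $\sin\theta$ and $\cos\theta$, respectively.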

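Inspired by Grover's search algorithm [NC00], amplitude amplification uses two
reflections in the two-dimensional plane spanned by $|\psi_{\mathrm{good}}\rangle$ and $|\psi_{\mathrm{bad}}\rangle$ to amplify the amplitude
of the good state ([Mos99] and references therein). First of all, given any quantum state
$|\varphi\rangle \in \mathcal{H}$, let $U^\perp_{\varphi}$ be a unitary such that $U^\perp_{\varphi}|\varphi\rangle = |\varphi\rangle$ and $U^\perp_{\varphi}|\varphi^\perp\rangle = -|\varphi^\perp\rangle$ for any state
$|\varphi^\perp\rangle$ that is orthogonal to $|\varphi\rangle$.

Now define

\[ |\bar\psi\rangle = \cos\theta\,|\psi_{\mathrm{good}}\rangle - \sin\theta\,|\psi_{\mathrm{bad}}\rangle \]

and observe that both $\{|\psi\rangle, |\bar\psi\rangle\}$ and $\{|\psi_{\mathrm{good}}\rangle, |\psi_{\mathrm{bad}}\rangle\}$ form an orthonormal basis for a
two-dimensional subspace $\mathcal{H}_2 \subset \mathcal{H}$ shown in Figure 1.3.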

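Definition 1.3.2. Define the amplitude amplification operator $Q = A\,U^\perp_{|0\rangle} A^\dagger\, U^\perp_{\psi_{\mathrm{bad}}}$.

Lemma 1.3.3. The amplification operator rotates the input state by an angle of $2\theta$ in
the two-dimensional subspace, i.e.

\[ Q|\psi\rangle = \sin(3\theta)\,|\psi_{\mathrm{good}}\rangle + \cos(3\theta)\,|\psi_{\mathrm{bad}}\rangle. \]

Proof. To show this, we first note that

\[ U^\perp_{\psi_{\mathrm{bad}}}|\psi\rangle = -\sin\theta\,|\psi_{\mathrm{good}}\rangle + \cos\theta\,|\psi_{\mathrm{bad}}\rangle. \tag{1.2} \]

Then we claim that $A\,U^\perp_{|0\rangle} A^\dagger = U^\perp_{A|0\rangle} = U^\perp_{\psi}$. To see this, consider an arbitrary quantum
state $|\varphi\rangle = \alpha|\psi\rangle + \beta|\psi^\perp\rangle$ for some state $|\psi^\perp\rangle$ orthogonal to $|\psi\rangle$ and complex numbers $\alpha, \beta$.
Now

\begin{align*}
A\,U^\perp_{|0\rangle} A^\dagger |\varphi\rangle &= A\,U^\perp_{|0\rangle} A^\dagger \big( \alpha|\psi\rangle + \beta|\psi^\perp\rangle \big) \\
&= A\,U^\perp_{|0\rangle} \big( \alpha|0\rangle + \beta A^\dagger|\psi^\perp\rangle \big) \\
&= A \big( \alpha|0\rangle - \beta A^\dagger|\psi^\perp\rangle \big) \\
&= \alpha|\psi\rangle - \beta|\psi^\perp\rangle \\
&= U^\perp_{A|0\rangle} \big( \alpha|\psi\rangle + \beta|\psi^\perp\rangle \big),
\end{align*}

where the third equality uses that $A^\dagger|\psi^\perp\rangle$ is orthogonal to $|0\rangle$, since
$\langle 0|A^\dagger|\psi^\perp\rangle = \langle\psi|\psi^\perp\rangle = 0$.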
Figure 1.3: The Subspace of Good and Bad States in Amplitude Amplification (axes $|\psi_{\mathrm{good}}\rangle$ and $|\psi_{\mathrm{bad}}\rangle$).

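To conclude the proof, we rewrite the right-hand side of (1.2) in the $\{|\psi\rangle, |\bar\psi\rangle\}$ basis:

\begin{align*}
U^\perp_{\psi_{\mathrm{bad}}}|\psi\rangle &= -\sin\theta\,|\psi_{\mathrm{good}}\rangle + \cos\theta\,|\psi_{\mathrm{bad}}\rangle \\
&= \cos(2\theta)\,|\psi\rangle - \sin(2\theta)\,|\bar\psi\rangle.
\end{align*}

Therefore

\begin{align*}
Q|\psi\rangle &= U^\perp_{A|0\rangle} \big( \cos(2\theta)\,|\psi\rangle - \sin(2\theta)\,|\bar\psi\rangle \big) \\
&= \cos(2\theta)\,|\psi\rangle + \sin(2\theta)\,|\bar\psi\rangle \\
&= \sin(3\theta)\,|\psi_{\mathrm{good}}\rangle + \cos(3\theta)\,|\psi_{\mathrm{bad}}\rangle.
\end{align*}

Figure 1.4 shows the geometrical interpretation of the action of $Q$. □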

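More generally, we can show that $Q$ rotates any input state in the subspace $\mathcal{H}_2$ by an
angle of $2\theta$. We see that

\[ U^\perp_{\psi_{\mathrm{bad}}} \big( \sin\phi\,|\psi_{\mathrm{good}}\rangle + \cos\phi\,|\psi_{\mathrm{bad}}\rangle \big) = -\sin\phi\,|\psi_{\mathrm{good}}\rangle + \cos\phi\,|\psi_{\mathrm{bad}}\rangle, \]

hence $U^\perp_{\psi_{\mathrm{bad}}}$ is a reflection in $\mathcal{H}_2$ about the axis defined by $|\psi_{\mathrm{bad}}\rangle$. Analogously,

\[ U^\perp_{A|0\rangle} \big( \sin\phi\,|\psi\rangle + \cos\phi\,|\bar\psi\rangle \big) = \sin\phi\,|\psi\rangle - \cos\phi\,|\bar\psi\rangle \]

shows that $U^\perp_{A|0\rangle}$ is a reflection about the axis defined by $|\psi\rangle$. The main result of amplitude
amplification now follows.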
Figure 1.4: Amplitude Amplification as Two Reflections.

Theorem 1.3.4. The application of $k$ amplitude amplification rounds yields the final state

\[ Q^k|\psi\rangle = \sin((2k+1)\theta)\,|\psi_{\mathrm{good}}\rangle + \cos((2k+1)\theta)\,|\psi_{\mathrm{bad}}\rangle. \]
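
The statement of the theorem can be checked numerically with a small state-vector simulation. The sketch below is an illustration, not the thesis' construction: it assumes real amplitudes, a uniform superposition over 16 basis states with a single good one, and it implements the reflection about $|\psi_{\mathrm{bad}}\rangle$ as a sign flip on the good basis states, which agrees with $U^\perp_{\psi_{\mathrm{bad}}}$ on the plane spanned by $|\psi_{\mathrm{good}}\rangle$ and $|\psi_{\mathrm{bad}}\rangle$. The function names are hypothetical.

```python
import math

def amplify(psi, good, rounds):
    """Apply the two reflections of Q to |psi> the given number of times.

    psi  : list of real amplitudes of A|0> (real for simplicity)
    good : set of indices x belonging to X_good
    """
    state = list(psi)
    for _ in range(rounds):
        # Flip the sign of the good components; on the plane spanned by
        # |psi_good> and |psi_bad> this is the reflection about |psi_bad>.
        state = [-a if x in good else a for x, a in enumerate(state)]
        # Reflection about |psi>: 2|psi><psi| - I.
        overlap = sum(p * a for p, a in zip(psi, state))
        state = [2 * overlap * p - a for p, a in zip(psi, state)]
    return state

def good_probability(state, good):
    """Probability of measuring an element of X_good."""
    return sum(a ** 2 for x, a in enumerate(state) if x in good)

# Assumed example: uniform superposition over 16 items, one of them good.
N = 16
psi = [1 / math.sqrt(N)] * N
good = {5}
theta = math.asin(math.sqrt(good_probability(psi, good)))

for k in range(4):
    simulated = good_probability(amplify(psi, good, k), good)
    predicted = math.sin((2 * k + 1) * theta) ** 2
    print(k, round(simulated, 6), round(predicted, 6))
```

Each printed row compares the simulated success probability after $k$ rounds with the predicted value $\sin^2((2k+1)\theta)$.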



Chapter 2 

Noise in Quantum Computation 



This chapter will introduce the concept of noise in quantum computing and the quantum 
operations formalism. The notion of the average fidelity is established and current
procedures to measure this quantity are presented. Also, the important Decomposition Lemma
2.3.13 is proved.

2.1 Noise in the Classical World 

We will start to look at noise in classical systems to establish an intuition for noise in
quantum systems. Consider the simple example of a bit of information stored on the hard
disk of a computer [NC00]. Initially, this bit has the value 0 or 1. After a long period of
time, the value of the bit can be corrupted by exposure to external magnetic fields and
high temperatures. The easiest way to model this process is to assume a probability $p$ that
the value of the bit is flipped.

With probability $p$, the value of the bit changes from 0 to 1 and vice versa. With
probability $1 - p$, the value of the bit remains unchanged. See Figure 2.1 for an illustration
of that process. The value of $p$ can be estimated by sampling the external magnetic field
surrounding the hard drive and the typical temperature distribution inside the computer.
The value for $p$ can be derived from the sampling data by using physical models for the
magnetic field and the effect of temperature on the bit.

To describe the general effect of the environment of our hard drive on the bit, we
assume that we do not know its initial value exactly. We rather have knowledge about the
distribution of the values of the bit. Let $p_0$ denote the probability that the initial state of
the bit is 0, and $p_1$ the corresponding probability for state 1. The effect of the environment
can now be modelled as a change of this probability distribution. Our model predicts that
the probability that the bit is in final state 0 after residing on the hard drive for a long
time is $q_0 = (1 - p)\,p_0 + p\,p_1$. Analogously for state 1, we have $q_1 = p\,p_0 + (1 - p)\,p_1$. If we
write probability distributions as two-dimensional real column vectors, we can express the
noise as a linear transformation on the probability distribution of the bit's state:

\[ \begin{pmatrix} q_0 \\ q_1 \end{pmatrix} = \begin{pmatrix} 1 - p & p \\ p & 1 - p \end{pmatrix} \begin{pmatrix} p_0 \\ p_1 \end{pmatrix}. \tag{2.1} \]

Figure 2.1: The Bit-Flip Error Model.


To see why it is useful to consider a probability distribution over the initial states of 
our bit, imagine a circuit that consists of two gates A and B, both of which are noisy and 
either act correctly or flip the result. Although the input to the first gate is known exactly, 
we only know the probability distribution over the possible outcomes 0 and 1 after A has
been applied. In order to obtain information about the final state after B is applied to 
that intermediate state, we need to consider gates as acting on probability distributions 
rather than definite input states. 

We will make an important assumption about noise. We will assume that the noise
affecting the second gate is independent of the noise affecting the first gate. This
assumption turns out to be reasonable as the gates are usually physically separated in
any implementation of that circuit. The assumption of the independence of noise turns
the circuit into a Markov process. The circuit starts out with an initial bit $X$, produces
an intermediate bit $Y$ and outputs a final bit $Z$. The probability distributions of the
states of each two consecutive bits are linearly related by an equation similar to (2.1). The
matrix is called the evolution matrix and is required to be stochastic to map a probability
distribution to another probability distribution. If we represent probability distributions
as column vectors and the action of the evolution as left multiplication by the evolution
matrix $M = (m_{i,j})$, then $M$ is stochastic if and only if all its entries are non-negative
and $\sum_i m_{i,j} = 1$ for all $j$. In other
words, the entries in each column of $M$ sum up to 1.
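
The two-gate Markov process can be sketched in a few lines. In this illustration, which is not from the original text, both gates are modelled as noisy identity gates with assumed flip probabilities 0.1 and 0.2, and the helper names are hypothetical.

```python
def bit_flip_matrix(p):
    """Evolution matrix of the bit-flip model; each column sums to 1."""
    return [[1 - p, p],
            [p, 1 - p]]

def apply(matrix, dist):
    """Left-multiply a probability column vector by an evolution matrix."""
    return [sum(matrix[i][j] * dist[j] for j in range(len(dist)))
            for i in range(len(matrix))]

def is_stochastic(matrix, tol=1e-12):
    """Check for non-negative entries and columns that sum to 1."""
    cols = range(len(matrix[0]))
    return (all(e >= 0 for row in matrix for e in row)
            and all(abs(sum(row[j] for row in matrix) - 1) < tol
                    for j in cols))

# Two independent noisy gates A and B (flip probabilities are assumed values).
A = bit_flip_matrix(0.1)
B = bit_flip_matrix(0.2)

x = [1.0, 0.0]          # the input bit X is known to be 0
y = apply(A, x)         # distribution of the intermediate bit Y
z = apply(B, y)         # distribution of the final bit Z
```

Because the noise in the two gates is independent, the distribution of $Z$ depends on $X$ only through the distribution of $Y$, which is the Markov property used in the text.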



2.2 Noise in Quantum Computing

The study of noise in quantum computing has identified different kinds of noise and
provided a model to completely describe the effect of noise on a quantum system. We saw
that classical noise is modelled using probability distributions over the classical states of
the system. For quantum systems, we will use a similar approach. We will consider
probability distributions over quantum states of the system, so that the concept of probability
distribution is merged with the quantum mechanical principle of superpositions and
complex amplitudes. It was shown that density operators completely describe the statistics of any
probability distribution over quantum states. We will now see how noisy quantum
operations can be modelled to complete this picture. If not specified otherwise, $\mathcal{H}$ will denote
the state space of the system in question.

2.2.1 Quantum Operations Formalism

The general evolution of the state of a quantum system can be described by a linear
operator on the density operator of the system, which corresponds to the evolution matrix
we have seen in the classical case. Analogous to the constraints on the evolution matrix of
a classical system, we define quantum operations as the most general evolution of an open
quantum system and we refer the reader to [NC00, Ch. 8] for an introduction to quantum
operations and physical motivations.

Definition 2.2.1. A quantum operation is a linear operator

\[ \mathcal{E} : L(\mathcal{H}_A) \to L(\mathcal{H}_B), \quad \rho' = \mathcal{E}(\rho) \]

on the set of density operators on $\mathcal{H}$ such that

• $\operatorname{tr} \mathcal{E}(\rho) = 1$.

• $\mathcal{E}$ is convex-linear. Given a finite probability distribution $\{p_1, p_2, \ldots, p_n\}$ over states
$\rho_1, \ldots, \rho_n$,

\[ \mathcal{E}\Big( \sum_i p_i \rho_i \Big) = \sum_i p_i\, \mathcal{E}(\rho_i). \]

• $\mathcal{E}$ is completely positive. $\mathcal{E}(\sigma_A)$ must be positive for any positive operator $\sigma_A \in
L(\mathcal{H}_A)$. Also, for any additional system $R$, $(\mathbb{1} \otimes \mathcal{E})(\sigma_{AR})$ must be positive for any
positive operator $\sigma_{AR}$ on the joint system $AR$.

We recall that the dynamics of a closed quantum system are described by a unitary $U$, 
which has the corresponding quantum operation $\mathcal{E}(\rho) = U\rho U^\dagger$. 
However, we will have to deal with open systems in general. In that setting the system is 
called the principal system and is surrounded by an environment. The environment includes 
everything that will interact with our principal system. Without loss of generality we may 
assume that the system and environment start out in a product state 
$\rho \otimes \rho_{env}$. As illustrated in Figure 2.2, the evolution of the joint system 
of principal system and environment is unitary with some operator $U$. As we only regard 
the principal system, we have to trace out the environment after the interaction to get the 
final state of the principal system alone. 



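[Figure: the principal system $\rho$ and the environment $\rho_{env}$ undergo a joint 
unitary $U$; tracing out the environment yields 
$\mathrm{tr}_{env}(U(\rho \otimes \rho_{env})U^\dagger) = \mathcal{E}(\rho)$.] 

Figure 2.2: A Quantum Operation As Unitary Evolution in a Larger System 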



Fact 2.2.2. The operation 

$$\mathcal{E}(\rho) = \mathrm{tr}_{env}\big(U(\rho \otimes \rho_{env})U^\dagger\big)$$

is a quantum operation. If the Hilbert space of the principal system has dimension $d$, it 
is sufficient to consider an environment of dimension $d^2$. 

Fact 2.2.3. Every quantum operation can be written in an operator-sum or Kraus operator 
representation 

$$\mathcal{E}(\rho) = \sum_{k=1}^{\le d^2} A_k \rho A_k^\dagger$$

where the $A_k$ are the operation elements or Kraus operators and are operators on the 
Hilbert space of the principal system. They satisfy the completeness condition 

$$\sum_k A_k^\dagger A_k = \mathbb{1}. \tag{2.2}$$

The converse also holds: any such operator sum gives rise to a quantum operation. 

The Kraus operators reveal information about the structure of the noise, as we will see 
in the following section. For that reason, determining the Kraus operators is an important 
goal for experimentalists [WHE+04]. 

Sometimes non-trace-preserving quantum operations are considered. Then Definition 2.2.1 
is changed such that $0 \le \mathrm{tr}\,\mathcal{E}(\rho) \le 1$, and 
$\mathrm{tr}\,\mathcal{E}(\rho)$ is understood as the probability that the process 
represented by $\mathcal{E}$ occurs. The condition (2.2) on the Kraus operators changes to 

$$0 \le \sum_k A_k^\dagger A_k \le \mathbb{1}.$$

Non-trace-preserving quantum operations occur when one distinguishes between measurement 
outcomes that occur in the middle of a process. In our model, the system-environment 
interaction could be described by a unitary evolution followed by a measurement $\{M_m\}$, 
and the quantum operations could be separated according to the outcome $m$ of the 
measurement. Then the operation $\mathcal{E}_m$ corresponding to outcome $m$ would not be 
trace-preserving. However, we typically do not distinguish between the outcomes of a 
possible measurement on the joint system-environment state and thus we only need to 
consider trace-preserving quantum operations. 

Although not physically motivated, it will turn out to be of mathematical interest to 
consider general linear operators on $L(\mathcal{H})$, which we will call superoperators 
later on. Certain results are easier to obtain in this general setting and can then be 
deduced for quantum operations. We are interested in superoperators that can be described 
by up to $d^2$ Kraus operators $A_k$, which do not need to satisfy any constraints. These 
are called completely-positive superoperators. 

Fact 2.2.4. Any set of up to $d^2$ operators $A_k \in L(\mathcal{H})$ defines a 
completely-positive superoperator 

$$\mathcal{E}(\rho) = \sum_{k=1}^{\le d^2} A_k \rho A_k^\dagger.$$

The converse also holds: any completely-positive superoperator has a Kraus decomposition. 

2.2.2 Single Qubit Noise 

We will illustrate how the quantum operations formalism is useful in characterizing noise 
by showing how the different kinds of errors on a single-qubit system translate into the 
quantum operations formalism and how the Kraus operators reveal the structure of the 
noise. 

The bit-flip channel is the quantum analog of the classical bit flip error. It has 
operation elements 

$$E_0 = \sqrt{p}\,\mathbb{1}, \qquad E_1 = \sqrt{1-p}\,X,$$

from which we see that the channel either acts as the identity with probability $p$, or as 
a NOT gate with probability $1 - p$. 

The phase-flip channel randomly applies a phase flip with a fixed probability $1 - p$. 
The operation elements are 

$$E_0 = \sqrt{p}\,\mathbb{1}, \qquad E_1 = \sqrt{1-p}\,Z.$$

We can also model a combined phase and bit-flip channel, which gives a bit-phase flip 
channel. It is characterized by its operation elements 

$$E_0 = \sqrt{p}\,\mathbb{1}, \qquad E_1 = \sqrt{1-p}\,Y.$$

These examples show how the error model corresponds to the Kraus operators of an actual 
quantum operation implemented by a quantum computer. An even more interesting error model 
is the depolarizing channel, of which we will make heavy use later on. Although this error 
model does not seem to reveal much information about the error at all, it will prove very 
useful. The depolarizing channel either sends the input state to the completely mixed state 
$\frac{\mathbb{1}}{2}$ with probability $p$, or leaves it untouched with probability 
$1 - p$. This channel is most naturally described as a quantum operation 

$$\mathcal{E}(\rho) = p\,\frac{\mathbb{1}}{2} + (1-p)\,\rho.$$

To find its Kraus operator decomposition, we observe that 

$$\frac{\mathbb{1}}{2} = \frac{\rho + X\rho X + Y\rho Y + Z\rho Z}{4}$$

and thus 

$$\mathcal{E}(\rho) = \Big(1 - \frac{3p}{4}\Big)\rho + \frac{p}{4}\,(X\rho X + Y\rho Y + Z\rho Z).$$

Therefore the operation elements are 

$$E_0 = \sqrt{1 - \tfrac{3p}{4}}\,\mathbb{1}, \quad E_1 = \frac{\sqrt{p}}{2}X, \quad E_2 = \frac{\sqrt{p}}{2}Y, \quad \text{and} \quad E_3 = \frac{\sqrt{p}}{2}Z.$$

Parameterizing the channel in a slightly different way, we end up with 

$$\mathcal{E}(\rho) = (1 - p)\rho + \frac{p}{3}\,(X\rho X + Y\rho Y + Z\rho Z)$$

and we can think of the channel as if it acts as the identity with probability $1 - p$ and 
as a random Pauli gate with probability $p$. 
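
The Kraus decomposition of the depolarizing channel can be checked numerically. The 
following is a small sketch under stated assumptions: the value $p = 0.3$, the test state 
and the helper names (`depolarizing_kraus`, `apply_channel`, `is_trace_preserving`) are 
hypothetical choices made for illustration. 

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def depolarizing_kraus(p):
    """Operation elements E_0..E_3 of the channel rho -> p*1/2 + (1-p)*rho."""
    return [np.sqrt(1 - 3 * p / 4) * I2,
            np.sqrt(p) / 2 * X,
            np.sqrt(p) / 2 * Y,
            np.sqrt(p) / 2 * Z]

def apply_channel(kraus_ops, rho):
    """Operator-sum representation: sum_k A_k rho A_k^dagger."""
    return sum(A @ rho @ A.conj().T for A in kraus_ops)

def is_trace_preserving(kraus_ops):
    """Completeness condition: sum_k A_k^dagger A_k equals the identity."""
    total = sum(A.conj().T @ A for A in kraus_ops)
    return bool(np.allclose(total, np.eye(total.shape[0])))

p = 0.3                                              # hypothetical noise strength
rho = np.array([[0.75, 0.25j], [-0.25j, 0.25]])      # an arbitrary single-qubit state
out = apply_channel(depolarizing_kraus(p), rho)
expected = p * I2 / 2 + (1 - p) * rho

print(np.allclose(out, expected), is_trace_preserving(depolarizing_kraus(p)))
```

The same two helper functions apply unchanged to the bit-flip, phase-flip and bit-phase 
flip channels by passing their two operation elements. 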

Note that the depolarizing channel can be generalized to a $d$-dimensional system as well 
and reads 

$$\mathcal{E}(\rho) = p\,\frac{\mathbb{1}}{d} + (1-p)\,\rho.$$

We will consider a slightly more general notion later on. 

2.3 Measuring the Impact of the Noise 

Determining the structure of noise is necessary to design efficient error-correction 
schemes. We cannot go into the details of fault-tolerant quantum computing here, but refer 
to [NC00] for an introduction to quantum error correction and fault-tolerant quantum 
computing. In this section, we will show how information about the structure of the noise 
can be revealed using current techniques. However, only one of them seems to be efficient, 
as the number of required experiments for all other methods scales polynomially in the 
dimension $d = 2^N$ of the system Hilbert space $\mathcal{H}$, which is exponential in the 
number of qubits $N$. 

We will first describe how noise is assessed in general and proceed to methods that 
specifically determine a certain property of the noise. 

2.3.1 Quantum Process Tomography 

Quantum process tomography is a combination of experimental and mathematical techniques 
to determine the elements of a matrix representation of a quantum operation $\mathcal{E}$ 
and/or the corresponding Kraus operators $A_k$. We will first introduce quantum state 
tomography, a prerequisite for process tomography. See [NC00] for a general description of 
quantum process tomography. [Hav03] provides the tools for converting between different 
representations of the quantum operation. For a description of an actual experimental 
determination of $\mathcal{E}$ for an implementation of the Quantum Fourier 
Transform, see 

State Tomography 

Quantum state tomography is a procedure to experimentally determine an unknown quantum 
state. Suppose we are given many copies of an unknown state $\rho$ and our task is to 
determine the matrix entries of $\rho$ in the computational basis. Note that it is 
essential to have many copies of $\rho$: a single copy does not suffice, since 
non-orthogonal states cannot be reliably distinguished. 

We will first look at the case of a single qubit as it provides good insight into the 
general procedure. Pick an orthonormal basis for $L(\mathcal{H})$, say 
$\{\frac{1}{\sqrt{2}}\mathbb{1}, \frac{1}{\sqrt{2}}X, \frac{1}{\sqrt{2}}Y, \frac{1}{\sqrt{2}}Z\}$. 
Any density operator $\rho$ can be written as 

$$\rho = \frac{\langle\mathbb{1},\rho\rangle\mathbb{1} + \langle X,\rho\rangle X + \langle Y,\rho\rangle Y + \langle Z,\rho\rangle Z}{2}
= \frac{\mathbb{1} + \mathrm{tr}(X\rho)X + \mathrm{tr}(Y\rho)Y + \mathrm{tr}(Z\rho)Z}{2},$$

where we used the fact that the inner product on $L(\mathcal{H})$ is the Hilbert-Schmidt or 
trace inner product, that the Pauli operators are self-adjoint, and that density operators 
have trace 1. Quantum state tomography works because $\mathrm{tr}(A\rho)$ can be determined 
experimentally using a projective measurement of the observable $A$, which can be any 
Hermitian operator. It turns out that the Pauli operators are observables that are easy to 
measure for physical systems of interest. In general, any basis for $L(\mathcal{H})$ 
comprised of easily measurable observables is sufficient. 

Let $M$ be an observable of a von Neumann measurement with spectral decomposition 
$M = \sum_m m P_m$ with orthogonal projectors $P_m$. The expected value of a measurement of 
this observable on a state $\rho$ is given by 

$$E(M) = \sum_m m\,p(m) = \sum_m m\,\mathrm{tr}(P_m^\dagger P_m \rho) = \sum_m \mathrm{tr}(m P_m \rho) = \mathrm{tr}(M\rho)$$

using that $P_m = P_m^\dagger$ and $P_m^2 = P_m$. The coefficients in the representation of 
$\rho$ in the Pauli basis for $L(\mathcal{H})$ can be interpreted as expected values of the 
Pauli observables. 

It is easy to estimate $\mathrm{tr}(X\rho)$, for example. Suppose we are given $k$ copies 
of $\rho$ and we measure the observable $X$ on each copy. Given the spectral decomposition 

$$X = |+\rangle\langle+| - |-\rangle\langle-|$$

we see that the outcomes $x_i$ of the experiments are $+1$ or $-1$. The average value of 
these $k$ experimental outcomes 

$$\bar{x} = \frac{1}{k}\sum_{i=1}^{k} x_i$$

is a reasonable estimate for $\mathrm{tr}(X\rho)$. By the central limit theorem, the random 
variable $\bar{x}$ is almost Gaussian distributed with expected value 
$\mathrm{tr}(X\rho)$ and standard deviation at most $\frac{1}{\sqrt{k}}$. Analogously, 
$\mathrm{tr}(Y\rho)$ and $\mathrm{tr}(Z\rho)$ can be determined within a desired 
confidence. One might of course use other standard statistical techniques to estimate the 
expected value of a random variable. 
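
The single-qubit procedure can be simulated on a classical computer. The sketch below is an 
illustration only: the measurement outcomes are drawn from a pseudo-random generator 
instead of an experiment, and the helper names, the seed, the test state and the number of 
copies are hypothetical choices. 

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
PAULIS = {
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}

def sample_outcomes(observable, rho, k, rng):
    """Simulate k projective measurements of a Pauli observable; outcomes are +1 or -1."""
    mean = np.real(np.trace(observable @ rho))    # tr(A rho), the expected value
    prob_plus = (1 + mean) / 2                    # probability of the outcome +1
    return np.where(rng.random(k) < prob_plus, 1, -1)

def estimate_state(rho, k, rng):
    """Reconstruct rho = (1 + x X + y Y + z Z)/2 from k simulated measurements per Pauli."""
    estimate = I2.copy()
    for name in "XYZ":
        average = sample_outcomes(PAULIS[name], rho, k, rng).mean()
        estimate = estimate + average * PAULIS[name]
    return estimate / 2

rng = np.random.default_rng(0)
rho = np.array([[0.75, 0.25j], [-0.25j, 0.25]])   # the "unknown" state
rho_est = estimate_state(rho, 20000, rng)

print(np.round(rho_est, 3))
```

With $k = 20000$ copies per observable, each Pauli expectation value is estimated with a 
standard deviation of at most $1/\sqrt{k} \approx 0.007$. Note that a finite-sample 
estimate of this form always has trace 1 and is Hermitian, but it is not guaranteed to be 
positive. 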

In the case of $N$ qubits one makes use of the following fact. 

Fact 2.3.1. The set 

$$\Big\{ \frac{1}{\sqrt{2^N}}\,\sigma_{s_1} \otimes \sigma_{s_2} \otimes \cdots \otimes \sigma_{s_N} \;\Big|\; s_i \in \{I, X, Y, Z\},\ 1 \le i \le N \Big\}$$

is an orthonormal basis for the space of linear operators on the $2^N$-dimensional Hilbert 
space $\mathcal{H}_{2^N}$ of $N$ qubits. We will call this basis the product basis for the 
linear operators on $\mathcal{H}_{2^N}$. We will call the operators tensor products of 
Pauli operators. 

By measuring all $4^N$ observables according to the procedure laid out above, we can 
get complete knowledge of the state $\rho$ of an $N$ qubit system. 

Process Tomography 

To determine the matrix elements of the quantum operation $\mathcal{E}$, we choose a basis 
for the space of linear operators on the state space of the system $\mathcal{H}$. Let 
$\mathcal{H}$ be a $d$-dimensional Hilbert space. As $\mathcal{E}$ is a linear operator on 
density operators on $\mathcal{H}$, it is completely characterized by its action on a basis 
of $L(\mathcal{H})$. One possible basis is the product basis. However, in many experimental 
settings it seems to be more natural to use a different basis. Pick an orthonormal basis 
$\{|\psi_1\rangle, \dots, |\psi_d\rangle\}$ for $\mathcal{H}$, say the computational basis 
$\{|1\rangle, |2\rangle, \dots, |d\rangle\}$. Then the set of operators 
$B_\sigma = \{\sigma^{(i,j)} = |\psi_i\rangle\langle\psi_j| : 1 \le i, j \le d\}$ forms an 
orthonormal basis for $L(\mathcal{H})$. Prepare the input states $\sigma^{(i,j)}$ and 
determine the resulting state $\mathcal{E}(\sigma^{(i,j)})$ using quantum state tomography. 
For $i \ne j$ the operator $\sigma^{(i,j)}$ is not itself a state; by linearity, 
$\mathcal{E}(\sigma^{(i,j)})$ can be obtained from the outputs for suitable superposition 
states [NC00]. 

This gives us the matrix elements of $\mathcal{E}$ as a supermatrix. Using the orthonormal 
basis for linear operators on $\mathcal{H}$ introduced above, we can represent a density 
operator $\rho$ as a $d$-by-$d$ complex-valued matrix with matrix entries $\rho_{i,j}$ such 
that 

$$\rho = \sum_{i,j=1}^{d} \rho_{i,j}\,\sigma^{(i,j)}.$$

However, we can also represent $\rho$ as a column vector 

$$\rho = (\rho_{1,1}, \rho_{1,2}, \dots, \rho_{1,d}, \rho_{2,1}, \dots, \rho_{d,d})^T.$$

Then $\mathcal{E}$ has a so-called supermatrix representation as a linear operator on a 
$d^2$-dimensional vector space of column vector representations of $\rho$. We will use the 
notation $\hat{\mathcal{E}}$ when we refer to this representation of the operator 
$\mathcal{E}$. Hence $\mathcal{E}$ can be represented as a $d^2$-by-$d^2$ supermatrix 
$\hat{\mathcal{E}}$. The experimental setup outlined above will give the matrix elements of 
$\hat{\mathcal{E}}$ in the supermatrix basis. Each of the input states $\sigma^{(i,j)}$ 
will yield a column $\hat{\mathcal{E}}_k$ of the supermatrix 
$\hat{\mathcal{E}} = (\hat{\mathcal{E}}_k)_{k=1}^{d^2}$, where we used a total order on the 
two-index structure $(i,j)$ to map it onto the single index $k$, say 
$(1,1), (1,2), \dots, (1,d), (2,1), \dots, (d,1), \dots, (d,d)$. It can be seen that this 
is exactly the order that gives the vector representation of $\rho$. 
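
The column-by-column construction of the supermatrix can be sketched as follows. This is an 
illustration under stated assumptions: the channel outputs are computed from a known Kraus 
decomposition instead of being measured, the bit-flip channel with a hypothetical 
probability 0.8 serves as the example, and the function names are introduced here. NumPy's 
row-major flattening reproduces the index order $(1,1), (1,2), \dots, (d,d)$. 

```python
import numpy as np

def supermatrix_from_channel(channel, d):
    """Build the d^2-by-d^2 supermatrix column by column from the outputs channel(|i><j|)."""
    S = np.zeros((d * d, d * d), dtype=complex)
    for i in range(d):
        for j in range(d):
            sigma = np.zeros((d, d), dtype=complex)
            sigma[i, j] = 1.0                       # basis operator |i><j|
            k = i * d + j                           # position of (i, j) in the total order
            S[:, k] = channel(sigma).reshape(-1)    # row-major vector representation
    return S

# Hypothetical example channel: bit flip that acts as the identity with probability 0.8.
X = np.array([[0, 1], [1, 0]], dtype=complex)
kraus = [np.sqrt(0.8) * np.eye(2, dtype=complex), np.sqrt(0.2) * X]

def bit_flip(rho):
    return sum(A @ rho @ A.conj().T for A in kraus)

S = supermatrix_from_channel(bit_flip, 2)
rho = np.array([[0.75, 0.25j], [-0.25j, 0.25]])

# The supermatrix acts on the column vector of rho by ordinary matrix multiplication.
print(np.allclose(S @ rho.reshape(-1), bit_flip(rho).reshape(-1)))
```

With this row-major flattening the supermatrix equals `sum(np.kron(A, A.conj()))` over the 
Kraus operators; the order of the two tensor factors in such formulas depends on the 
vectorization convention that is used. The trace of the supermatrix, 
$\sum_k |\mathrm{tr}A_k|^2$, does not depend on that choice. 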

The supermatrix representation $\hat{\mathcal{E}}$ is not very convenient in the study of 
noise. For most applications, the Kraus operator representation is more useful as it 
reveals the structure of the noise [NC00], which we have seen in the examples for 
single-qubit noise in Section 2.2.2. Let $E^{(i,j)}$ be the matrix representation of the 
elements of the $B_\sigma$ basis, i.e. the matrix with a 1 in the $i$-th row and $j$-th 
column and zeros everywhere else. 

Definition 2.3.2. The Choi matrix associated to a supermatrix $\hat{\mathcal{E}}$ is the 
matrix 

$$\hat{\mathcal{C}} = \sum_{i,j=1}^{d} (E^{(i,j)} \otimes \mathbb{1}_d)\,\hat{\mathcal{E}}\,(\mathbb{1}_d \otimes E^{(i,j)}).$$

Fact 2.3.3. [Hav03, WHE+04] The Choi matrix $\hat{\mathcal{C}}$ of a supermatrix 
$\hat{\mathcal{E}}$ is Hermitian with spectral decomposition 

$$\hat{\mathcal{C}} = \sum_{k=1}^{d^2} \lambda_k\, v_k v_k^\dagger$$

with all eigenvalues $\lambda_k \ge 0$ as $\mathcal{E}$ is completely-positive. Then the 
Kraus operator-sum representation of $\mathcal{E}$ is given by the Kraus operators 

$$A_k = \sqrt{\lambda_k}\,\mathrm{mat}(v_k)$$

for $1 \le k \le d^2$, where $\mathrm{mat}(v_k)$ denotes the $d$-by-$d$ matrix whose vector 
representation is the eigenvector $v_k$. 

It is also possible to convert a Kraus operator-sum representation of a quantum operation 
$\mathcal{E}$ to the superoperator representation $\hat{\mathcal{E}}$. 

Fact 2.3.4. [EAZ05] The supermatrix representation of $\mathcal{E}$ with Kraus operators 
$\{A_k\}$ is given by 

$$\hat{\mathcal{E}} = \sum_k A_k^* \otimes A_k. \tag{2.3}$$

We will use $\mathcal{E}$ to denote the superoperator and $\hat{\mathcal{E}}$ to denote its 
representation as a supermatrix using some fixed basis for $L(\mathcal{H})$ that will be 
clear from the context. Note that a superoperator is a linear operator on $L(\mathcal{H})$ 
and hence is more general than a quantum operation, which is a special case of a 
superoperator. In the following section, we will make use of superoperators to obtain more 
general results that will prove crucial later on. 

2.3.2 Noise Estimation Scenario 

There are two very common scenarios where noise estimation in quantum computing is 
important: quantum algorithms and quantum channels. We will introduce both settings 
and explain what noise estimation means in both contexts. 

Quantum Channel 




Figure 2.3: A Quantum Channel. 



From a theoretical point of view a quantum channel is a quantum gate that implements 
the identity transformation. However, a physical realisation of a quantum channel will 
usually be noisy and will implement a quantum operation that is not exactly the identity 
operation. We are interested in the "distance" between the identity and the operation the 
channel actually implements. 

Quantum Circuit 

A quantum circuit on $N$ qubits is a unitary transformation $U$ in the $2^N$-dimensional 
Hilbert space $\mathcal{H}$. We are interested in how closely a physical realization of a 
quantum computer implements $U$. We will call the physical implementation 
$\tilde{\mathcal{E}}$, where $\tilde{\mathcal{E}}$ is the actual quantum operation our 
quantum computer carries out when we try to implement $U$, as shown in Figure 2.4. We will 
later see that we can always think of the implementation $\tilde{\mathcal{E}}$ as a perfect 
implementation of $U$ followed by a noisy quantum channel $\mathcal{E}$. 




Figure 2.4: An Ideal versus an Actual Implementation of a Quantum Algorithm U. 



2.3.3 Distance Measures 

We are interested in the distance between a desired transformation $U$ and the actual 
operation $\tilde{\mathcal{E}}$. We have already seen that $U$ can be expressed as a 
quantum operation with exactly one Kraus operator, $U$ itself. We could employ the 
transformation (2.3) and try to find a distance measure between the supermatrix 
representation $U^* \otimes U$ of $U$ and the supermatrix representation of 
$\tilde{\mathcal{E}}$. However, it has proven more useful to define a distance measure on 
density operators and characterize how noise affects a single output state. 

We will start with a metric that is derived from the Hilbert-Schmidt inner product on 
the space of linear operators on $\mathcal{H}$. 

Definition 2.3.5. The trace distance between operators $\rho$ and $\sigma$ is defined as 

$$D(\rho, \sigma) = \frac{1}{2}\,\mathrm{tr}\,|\rho - \sigma|$$

where $|A| = \sqrt{A^\dagger A}$ is the positive square root of $A^\dagger A$. 

Fact 2.3.6. The trace distance is a metric on $L(\mathcal{H})$. 

Although this measure gives rise to a metric on the space of density operators, it is 
not typically used in the context of noise estimation. It is more common to use a measure 
called "fidelity" that also characterizes how similar two states are, hence it will give 
rise to a real number between 0 and 1. It seems that the fidelity is more suitable for 
analysis and is thus preferred over the trace distance. 

Definition 2.3.7. The Uhlmann fidelity between two states $\rho$ and $\sigma$ is defined as 

$$F(\rho, \sigma) = \Big(\mathrm{tr}\sqrt{\sqrt{\rho}\,\sigma\sqrt{\rho}}\Big)^2.$$

This measure does not give rise to a metric, but it is symmetric and turns into a simple 
expression when one of the states is pure. Let $\rho = |\psi\rangle\langle\psi|$ and 
observe 

$$F(|\psi\rangle\langle\psi|, \sigma) = \mathrm{tr}(|\psi\rangle\langle\psi|\,\sigma) = \langle\psi|\sigma|\psi\rangle$$

using $\sqrt{\rho} = \rho$ for a pure state and the cyclic property of the trace. As 
unitary transformations $U$ map pure states to pure states, we have that 

$$F(U|\psi\rangle\langle\psi|U^\dagger, \sigma) = \langle\psi|U^\dagger \sigma U|\psi\rangle.$$

We can now define the fidelity of a quantum channel and the gate fidelity as the 
Uhlmann fidelity between the desired and the actual outcome of a channel or a gate, 
respectively. 

Definition 2.3.8. Let $\mathcal{E}$ denote the actual quantum operation representing a 
quantum channel. The channel fidelity for an input state $|\psi\rangle$ is 

$$F(|\psi\rangle\langle\psi|, \mathcal{E}(|\psi\rangle\langle\psi|)) = \langle\psi|\mathcal{E}(|\psi\rangle\langle\psi|)|\psi\rangle.$$

Definition 2.3.9. Let $U$ be the unitary operator corresponding to a quantum gate. Let 
$\tilde{\mathcal{E}}$ denote the quantum operation of the actual implementation of $U$. The 
gate fidelity for an input state $|\psi\rangle$ is 

$$F_{|\psi\rangle}(U, \tilde{\mathcal{E}}) = F(U|\psi\rangle\langle\psi|U^\dagger, \tilde{\mathcal{E}}(|\psi\rangle\langle\psi|)) = \langle\psi|U^\dagger \tilde{\mathcal{E}}(|\psi\rangle\langle\psi|)U|\psi\rangle.$$

We will denote the gate fidelity by $F_U(|\psi\rangle)$ if $\tilde{\mathcal{E}}$ is clear 
from the context. 

In order to simplify our discussion we will treat a quantum channel as a quantum 
algorithm that implements the identity transformation. From the following definitions and 
results for quantum algorithms one obtains the corresponding definitions and results for 
quantum channels by replacing the operation $U$ by the identity operation $\mathbb{1}$. 

It is not very convenient to have the fidelity of a quantum channel or a quantum 
algorithm defined for a single state. There are two ways [NC00] to proceed towards a 
fidelity measure that is independent of the input state. Analogous to the study of the 
worst-case and average case behaviour of algorithms in theoretical computer science, we 
will look at the minimum and average gate fidelities. The minimum gate fidelity corresponds 
to the worst-case behaviour of our implementation of a unitary $U$, whereas the average 
gate fidelity is a number associated to the average behaviour of our implementation. 
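
The distance measures above can be evaluated numerically for small systems. The following 
sketch assumes the squared form of the Uhlmann fidelity, 
$F(\rho, \sigma) = (\mathrm{tr}\sqrt{\sqrt{\rho}\sigma\sqrt{\rho}})^2$, which reduces to 
$\langle\psi|\sigma|\psi\rangle$ for a pure state; the function names and the example 
states are hypothetical. 

```python
import numpy as np

def psd_sqrt(A):
    """Positive square root of a Hermitian positive semidefinite matrix."""
    vals, vecs = np.linalg.eigh(A)
    vals = np.clip(vals, 0.0, None)              # remove tiny negative round-off
    return (vecs * np.sqrt(vals)) @ vecs.conj().T

def trace_distance(rho, sigma):
    """D(rho, sigma) = (1/2) tr|rho - sigma| for Hermitian arguments."""
    return 0.5 * float(np.sum(np.abs(np.linalg.eigvalsh(rho - sigma))))

def uhlmann_fidelity(rho, sigma):
    """F(rho, sigma) = (tr sqrt(sqrt(rho) sigma sqrt(rho)))^2."""
    s = psd_sqrt(rho)
    inner = s @ sigma @ s
    inner = (inner + inner.conj().T) / 2         # enforce Hermiticity against round-off
    return float(np.real(np.trace(psd_sqrt(inner))) ** 2)

def gate_fidelity(U, implementation, psi):
    """<psi| U^dagger E~(|psi><psi|) U |psi> for a pure input state psi."""
    out = implementation(np.outer(psi, psi.conj()))
    ideal = U @ psi
    return float(np.real(ideal.conj() @ out @ ideal))

psi = np.array([1, 1j]) / np.sqrt(2)
rho_pure = np.outer(psi, psi.conj())
sigma = np.array([[0.75, 0.25j], [-0.25j, 0.25]])

print(uhlmann_fidelity(rho_pure, sigma), np.real(psi.conj() @ sigma @ psi))
print(trace_distance(rho_pure, sigma))
```

For a pure first argument both printed fidelity values agree, which is the simplification 
used in the definitions of the channel and gate fidelity. 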

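Definition 2.3.10. The minimum gate fidelity is the minimum of the gate fidelity taken 
over all input states $|\psi\rangle$. Hence 

$$F_{min}(U, \tilde{\mathcal{E}}) = \min_{|\psi\rangle} F_U(|\psi\rangle) = \min_{|\psi\rangle} \langle\psi|U^\dagger \tilde{\mathcal{E}}(|\psi\rangle\langle\psi|)U|\psi\rangle.$$

Definition 2.3.11. The average gate fidelity of a quantum algorithm $U$ with 
implementation $\tilde{\mathcal{E}}$ is defined as 

$$F_{avg}(U, \tilde{\mathcal{E}}) = \int_{F-S} F_{|\psi\rangle}(U, \tilde{\mathcal{E}})\,d|\psi\rangle = \int_{F-S} \langle\psi|U^\dagger \tilde{\mathcal{E}}(|\psi\rangle\langle\psi|)U|\psi\rangle\,d|\psi\rangle$$
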

where the integration is with respect to the Fubini-Study measure (Definition EZQ) . 
2.3.4 Introduction to Fidelity Estimation 

The following sections and the main result in this thesis will be devoted to estimating 
the average gate fidelity. Let $\tilde{\mathcal{E}}$ denote the quantum operation that 
represents the actual implementation of a quantum algorithm $U$. Let 

$$\tilde{\mathcal{E}}(\rho) = \sum_{k=1}^{d^2} A_k \rho A_k^\dagger.$$

We can factor out $U$ from the Kraus operators to get a quantum operation $\mathcal{E}$ 
that does not depend on $U$, i.e. an operation such that 
$\mathcal{E}(U\rho U^\dagger) = \tilde{\mathcal{E}}(\rho)$. We can think of $\mathcal{E}$ 
as the quantum operation that just characterizes the cumulative noise introduced by the 
implementation of $U$ and the experimental control. Let $E_k = A_k U^\dagger$ be the Kraus 
operators of $\mathcal{E}$ and observe 

$$\begin{aligned}
\mathcal{E}(U\rho U^\dagger) &= \sum_{k=1}^{d^2} E_k U\rho U^\dagger E_k^\dagger \\
&= \sum_{k=1}^{d^2} A_k U^\dagger U\rho U^\dagger U A_k^\dagger \\
&= \sum_{k=1}^{d^2} A_k \rho A_k^\dagger = \tilde{\mathcal{E}}(\rho).
\end{aligned}$$

We will see later that the average fidelity will not depend on $U$ but only on the 
cumulative noise described by $\mathcal{E}$. 

2.3.5 Fidelity Estimation using Quantum Process Tomography 

For the easiest case, assume we already have all the matrix elements of $\mathcal{E}$. Now 
the average fidelity can be computed from them by a direct calculation [EAZ05]. We will 
show a general formula for averages over the Fubini-Study measure and derive an explicit 
formula for the average gate fidelity as a corollary. 

Definition 2.3.12. Define the representation $\hat{U}$ of $U \in U(d)$ on 
$L(\mathcal{H})$ as $\hat{U}\rho = U\rho U^\dagger$ for all $\rho \in L(\mathcal{H})$. Note 
that this is the usual action of $U(d)$ on density operators which we extend to all linear 
operators. Furthermore, we will call a superoperator $\Lambda$ unitarily invariant if 
$\hat{U}\hat{\Lambda}\hat{U}^\dagger = \hat{\Lambda}$ for all $U \in U(d)$. 

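Lemma 2.3.13. Let $\Lambda$ be a unitarily invariant superoperator and 
$X \in L(\mathcal{H})$. Then 

$$\Lambda(X) = p\Big(X - \mathrm{tr}X\,\frac{\mathbb{1}}{d}\Big) + q\,\mathrm{tr}X\,\frac{\mathbb{1}}{d}$$

where $q = \frac{\mathrm{tr}\Lambda(\mathbb{1})}{d}$ and 
$p = \frac{\mathrm{tr}\hat{\Lambda} - q}{d^2 - 1}$. 

Proof. The representation $\hat{U}$ is reducible. Denote by $M_0 \subset L(\mathcal{H})$ 
the space of traceless linear operators, and let 
$M_1 = \{c\mathbb{1} \mid c \in \mathbb{C}\} \subset L(\mathcal{H})$ be the subspace of 
multiples of the identity operator. There is no non-trivial subspace of $M_0$ that is 
invariant under $U(d)$ [Boe67, Boe70] and $M_1$ is the smallest subspace that contains the 
identity. Hence both subspaces are irreducible. Observe that every linear operator 
$X \in L(\mathcal{H})$ has a decomposition into a traceless part and a multiple of the 
identity: $X = (X - \frac{\mathrm{tr}X}{d}\mathbb{1}) + \frac{\mathrm{tr}X}{d}\mathbb{1}$, 
hence $L(\mathcal{H}) = M_0 \oplus M_1$. Hence the sets 

$$U_0 = \{\hat{U}|_{M_0} \mid U \in U(d)\}$$

and 

$$U_1 = \{\hat{U}|_{M_1} \mid U \in U(d)\},$$

with $X|_S$ meaning the restriction of $X$ onto the subspace $S$, are irreducible with 
respect to $L(\mathcal{H})$. 

$\Lambda$ is unitarily invariant and it follows that 
$\hat{\Lambda}\hat{U} = \hat{U}\hat{\Lambda}$ for all $U \in U(d)$. This commutation 
relation is also true for the operators restricted to the subspaces $M_0$ and $M_1$ of 
$L(\mathcal{H})$. Schur's Lemma (Fact A.8.5) implies that the restriction of $\Lambda$ onto 
each subspace is a multiple of the identity operator. Hence for $X \in L(\mathcal{H})$, 

$$\Lambda(X) = p\Big(X - \mathrm{tr}X\,\frac{\mathbb{1}}{d}\Big) + q\,\mathrm{tr}X\,\frac{\mathbb{1}}{d}$$

for complex numbers $p$ and $q$. These can be determined by evaluating the superoperator 
$\Lambda$ for certain operators. 

$$\Lambda(\mathbb{1}) = p\Big(\mathbb{1} - \mathrm{tr}\,\mathbb{1}\,\frac{\mathbb{1}}{d}\Big) + q\,\mathrm{tr}\,\mathbb{1}\,\frac{\mathbb{1}}{d} = q\,\mathbb{1}$$

This gives $q = \frac{\mathrm{tr}\Lambda(\mathbb{1})}{d}$. Now evaluate 
$\langle i|\Lambda(\sigma^{(i,j)})|j\rangle$ for the elements of the orthonormal basis 
$\sigma^{(i,j)}$ of $L(\mathcal{H})$. 

$$\langle i|\Lambda(\sigma^{(i,j)})|j\rangle = p\,\langle i|\Big(\sigma^{(i,j)} - \delta_{ij}\frac{\mathbb{1}}{d}\Big)|j\rangle + q\,\delta_{ij}\langle i|\frac{\mathbb{1}}{d}|j\rangle = p\Big(1 - \frac{\delta_{ij}}{d}\Big) + q\,\frac{\delta_{ij}}{d}$$

With the inner product $\langle X, Y\rangle = \mathrm{tr}(X^\dagger Y)$ on 
$L(\mathcal{H})$ and the cyclic property of the trace, we compute the value for $p$: 

$$\begin{aligned}
\mathrm{tr}\hat{\Lambda} &= \sum_{i,j=1}^{d} \langle\sigma^{(i,j)}, \Lambda(\sigma^{(i,j)})\rangle \\
&= \sum_{i,j=1}^{d} \mathrm{tr}\big((\sigma^{(i,j)})^\dagger \Lambda(\sigma^{(i,j)})\big) \\
&= \sum_{i,j=1}^{d} \mathrm{tr}\big(|j\rangle\langle i|\,\Lambda(\sigma^{(i,j)})\big) \\
&= \sum_{i,j=1}^{d} \langle i|\Lambda(\sigma^{(i,j)})|j\rangle \\
&= d^2 p - p + q = (d^2 - 1)p + q.
\end{aligned}$$

□ 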



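We will present the previous lemma in a slightly rearranged form to ease further 
calculations. 

Corollary 2.3.14. Let $\Lambda$ be a unitarily invariant superoperator and 
$X \in L(\mathcal{H})$. Then 

$$\Lambda(X) = pX + q\,\mathrm{tr}X\,\frac{\mathbb{1}}{d} \tag{2.4}$$

where 

$$p = \frac{d\,\mathrm{tr}\hat{\Lambda} - \mathrm{tr}\Lambda(\mathbb{1})}{d(d^2 - 1)}$$

and 

$$q = \frac{d\,\mathrm{tr}\Lambda(\mathbb{1}) - \mathrm{tr}\hat{\Lambda}}{d^2 - 1}.$$

We can simplify this expression if we assume more structure on the superoperator. 

Corollary 2.3.15. A trace-preserving unitarily invariant quantum operation $\Lambda$ is a 
depolarizing channel 

$$\Lambda(\rho) = p\rho + (1 - p)\frac{\mathbb{1}}{d}$$

where 

$$p = \frac{\mathrm{tr}\hat{\Lambda} - 1}{d^2 - 1}.$$

Proof. By rearranging terms in Lemma 2.3.13, we can see that $\Lambda$ is a depolarizing 
channel if $\Lambda$ is trace-preserving and restricted to density operators $\rho$. Using 
$\mathrm{tr}(\Lambda(\mathbb{1})) = \mathrm{tr}\,\mathbb{1} = d$ and 
$\mathrm{tr}\rho = 1$, 

$$\Lambda(\rho) = p\rho + (1 - p)\frac{\mathbb{1}}{d}.$$

□ 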

We can now show that the average of a certain quartic function over the Fubini-Study 
measure can be explicitly calculated. We will need another lemma first to connect general 
superoperators to unitarily-invariant superoperators. 

Definition 2.3.16. Let $\Lambda$ be a superoperator on $L(\mathcal{H})$. Define the 
twirled superoperator 

$$\hat{\Lambda}_T = \int_{U(d)} \hat{V}\hat{\Lambda}\hat{V}^\dagger\,dV$$

where $\Lambda_T(X) = \int_{U(d)} V\Lambda(V^\dagger X V)V^\dagger\,dV$. 

Now we can show that twirling leads to a unitarily invariant superoperator. 

Lemma 2.3.17. Let $\Lambda$ be a superoperator on $L(\mathcal{H})$. Then the twirled 
superoperator $\hat{\Lambda}_T$ is unitarily invariant. 

Proof. Pick $U \in U(d)$. With the change of variables $V' = UV$ and the invariance of the 
Haar measure on $U(d)$, we derive 

$$\begin{aligned}
(\hat{U}\hat{\Lambda}_T\hat{U}^\dagger)(X) &= \Big(\hat{U}\int_{U(d)} \hat{V}\hat{\Lambda}\hat{V}^\dagger\,dV\;\hat{U}^\dagger\Big)(X) \\
&= \int_{U(d)} UV\Lambda(V^\dagger U^\dagger X U V)V^\dagger U^\dagger\,dV \\
&= \int_{U(d)} V'\Lambda((V')^\dagger X V')(V')^\dagger\,dV' \\
&= \Lambda_T(X).
\end{aligned}$$

□ 



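Theorem 2.3.18. Let $M, N \in L(\mathcal{H})$. Then 

$$\int_{F-S} \langle\psi|M|\psi\rangle\langle\psi|N|\psi\rangle\,d|\psi\rangle = \frac{1}{d(d+1)}\big(\mathrm{tr}MN + \mathrm{tr}M\,\mathrm{tr}N\big). \tag{2.5}$$

Proof. Define the superoperator $\Lambda(X) = MXN$, so that the integrand equals 
$\langle\psi|\Lambda(|\psi\rangle\langle\psi|)|\psi\rangle$. We start with the unitary 
invariance of the Fubini-Study measure. It follows that we can replace integration over the 
set of all pure states by integration over the set of all unitary operators in $U(d)$. By 
the invariance of the Haar measure over $U(d)$, we can use any fixed pure state 
$|\psi_0\rangle$. 

$$\begin{aligned}
\int_{F-S} \langle\psi|\Lambda(|\psi\rangle\langle\psi|)|\psi\rangle\,d|\psi\rangle
&= \int_{U(d)} \langle\psi_0|V^\dagger\Lambda(V|\psi_0\rangle\langle\psi_0|V^\dagger)V|\psi_0\rangle\,dV \\
&= \int_{U(d)} \langle\psi_0|V\Lambda(V^\dagger|\psi_0\rangle\langle\psi_0|V)V^\dagger|\psi_0\rangle\,dV \\
&= \langle\psi_0|\int_{U(d)} V\Lambda(V^\dagger|\psi_0\rangle\langle\psi_0|V)V^\dagger\,dV\,|\psi_0\rangle \\
&= \langle\psi_0|\Lambda_T(|\psi_0\rangle\langle\psi_0|)|\psi_0\rangle
\end{aligned}$$

where the second equation follows from the fact that the map 
$\dagger : U(d) \to U(d)$ is a homeomorphism of $U(d)$ onto itself as $U(d)$ is a 
topological group. Now we use the representation Lemma 2.3.13 and the unitary invariance of 
$\hat{\Lambda}_T$ from Lemma 2.3.17. To directly apply Lemma 2.3.13 we need to show that 
$\mathrm{tr}\hat{\Lambda}$ and $\mathrm{tr}\Lambda(\mathbb{1})$ are $U(d)$-invariant, i.e. 
they are not changed by twirling. Observe that $\hat{U} = U^* \otimes U$ is a unitary 
operator on $\mathcal{H} \otimes \mathcal{H}$. With the linearity of the integral and the 
linearity and the cyclic property of the trace we thus have that 

$$\begin{aligned}
\mathrm{tr}\hat{\Lambda}_T &= \mathrm{tr}\int_{U(d)} \hat{U}\hat{\Lambda}\hat{U}^\dagger\,dU = \int_{U(d)} \mathrm{tr}\,\hat{U}\hat{\Lambda}\hat{U}^\dagger\,dU \\
&= \int_{U(d)} \mathrm{tr}\,\hat{\Lambda}\hat{U}^\dagger\hat{U}\,dU = \int_{U(d)} \mathrm{tr}\hat{\Lambda}\,dU = \mathrm{tr}\hat{\Lambda} \\
&= \sum_{i,j} \mathrm{tr}\big(M|i\rangle\langle j|N|j\rangle\langle i|\big) = \sum_{i,j} \langle i|M|i\rangle\langle j|N|j\rangle = \sum_i \langle i|M|i\rangle \sum_j \langle j|N|j\rangle \\
&= \mathrm{tr}M\,\mathrm{tr}N.
\end{aligned}$$

Furthermore 

$$\begin{aligned}
\mathrm{tr}\Lambda_T(\mathbb{1}) &= \mathrm{tr}\int_{U(d)} U\Lambda(U^\dagger\mathbb{1}U)U^\dagger\,dU = \int_{U(d)} \mathrm{tr}\,U\Lambda(U^\dagger\mathbb{1}U)U^\dagger\,dU \\
&= \int_{U(d)} \mathrm{tr}\,\Lambda(\mathbb{1})U^\dagger U\,dU = \int_{U(d)} \mathrm{tr}\Lambda(\mathbb{1})\,dU = \mathrm{tr}\Lambda(\mathbb{1}) \\
&= \mathrm{tr}MN.
\end{aligned}$$

Therefore, applying Lemma 2.3.13 to $\Lambda_T$ with 
$q = \frac{\mathrm{tr}MN}{d}$ and 
$p = \frac{\mathrm{tr}M\,\mathrm{tr}N - \mathrm{tr}MN/d}{d^2 - 1}$, 

$$\begin{aligned}
\int_{F-S} \langle\psi|M|\psi\rangle\langle\psi|N|\psi\rangle\,d|\psi\rangle
&= \langle\psi_0|\Lambda_T(|\psi_0\rangle\langle\psi_0|)|\psi_0\rangle \\
&= \frac{\mathrm{tr}M\,\mathrm{tr}N - \frac{\mathrm{tr}MN}{d}}{d^2 - 1}\Big(1 - \frac{1}{d}\Big) + \frac{\mathrm{tr}MN}{d}\cdot\frac{1}{d} \\
&= \frac{1}{d(d+1)}\big(\mathrm{tr}MN + \mathrm{tr}M\,\mathrm{tr}N\big).
\end{aligned}$$

This finishes the proof. □ 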
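
Equation (2.5) can be checked by Monte Carlo integration. The sketch below assumes the 
standard fact that normalized vectors of independent complex Gaussian entries are 
distributed according to the Fubini-Study measure; the dimension, the operators $M$ and 
$N$, the sample size and the function names are hypothetical choices. 

```python
import numpy as np

def random_pure_states(d, n, rng):
    """n pure states in dimension d (one per row) from normalized complex Gaussian vectors."""
    v = rng.normal(size=(n, d)) + 1j * rng.normal(size=(n, d))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

def expectation_values(A, states):
    """<psi|A|psi> for every row psi of states."""
    return np.einsum("ni,ij,nj->n", states.conj(), A, states)

def monte_carlo_average(M, N, n, rng):
    """Sample average of <psi|M|psi><psi|N|psi> over n random pure states."""
    states = random_pure_states(M.shape[0], n, rng)
    return np.mean(expectation_values(M, states) * expectation_values(N, states))

def closed_form(M, N):
    """Right-hand side of (2.5)."""
    d = M.shape[0]
    return (np.trace(M @ N) + np.trace(M) * np.trace(N)) / (d * (d + 1))

rng = np.random.default_rng(1)
M = np.diag([1.0, 0.0, 0.0]).astype(complex)                      # a projector, d = 3
N = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 1]], dtype=complex)    # a permutation matrix

estimate = monte_carlo_average(M, N, 200000, rng)
exact = closed_form(M, N)
print(estimate, exact)
```

For this choice $\mathrm{tr}MN = 0$ and $\mathrm{tr}M = \mathrm{tr}N = 1$, so the closed 
form gives $1/12$; the sampled value agrees up to statistical fluctuations of order 
$1/\sqrt{n}$. 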

Using Theorem 2.3.18, the average gate fidelity can be calculated given the superoperator 
or Kraus operator representation of the actual implementation $\tilde{\mathcal{E}}$. 

Corollary 2.3.19 ([EAZ05]). The average fidelity of a trace-preserving quantum operation 
is 

$$F_{avg}(U, \tilde{\mathcal{E}}) = \frac{\sum_{k=1}^{d^2} |\mathrm{tr}(E_k)|^2 + d}{d^2 + d}$$

where the $E_k$ are the Kraus operators of $\mathcal{E}$, i.e. of $\tilde{\mathcal{E}}$ 
with $U$ factored out. 

Proof. By the unitary invariance of the Fubini-Study measure we can introduce the change 
of variables $|\psi\rangle = U^\dagger|\psi'\rangle$. 

$$F_{avg}(U, \tilde{\mathcal{E}}) = \int_{F-S} \langle\psi|U^\dagger\mathcal{E}(U|\psi\rangle\langle\psi|U^\dagger)U|\psi\rangle\,d|\psi\rangle = \int_{F-S} \langle\psi'|\mathcal{E}(|\psi'\rangle\langle\psi'|)|\psi'\rangle\,d|\psi'\rangle$$

This shows that the average gate fidelity does not depend on $U$ but rather on the 
cumulative noise $\mathcal{E}$ introduced by the implementation of $U$ and the overall 
experimental control. By Theorem 2.3.18 and the linearity of the integral, we can rewrite 
the average gate fidelity for the trace-preserving noise operation $\mathcal{E}$ using its 
Kraus operator decomposition as follows. 

$$\begin{aligned}
F_{avg}(U, \tilde{\mathcal{E}}) &= \sum_{k=1}^{d^2} \int_{F-S} \langle\psi|E_k|\psi\rangle\langle\psi|E_k^\dagger|\psi\rangle\,d|\psi\rangle \\
&= \sum_{k=1}^{d^2} \frac{\mathrm{tr}(E_k E_k^\dagger) + \mathrm{tr}E_k\,\mathrm{tr}E_k^\dagger}{d(d+1)} \\
&= \frac{\sum_{k=1}^{d^2} |\mathrm{tr}E_k|^2 + d}{d(d+1)}
\end{aligned}$$

Here $\sum_k \mathrm{tr}(E_k E_k^\dagger) = \mathrm{tr}\sum_k E_k^\dagger E_k = \mathrm{tr}\,\mathbb{1} = d$ 
by the completeness condition (2.2). □ 

Although Corollary 2.3.19 provides an easy way to compute the average gate fidelity, the 
Kraus operator or superoperator representation of the cumulative noise operation 
$\mathcal{E}$ needs to be known. This will in general require quantum process tomography to 
be conducted. We will provide two alternate approaches to estimating the average fidelity 
that do not require explicit knowledge of $\mathcal{E}$. 

2.3.6 Fidelity Estimation using Quantum State Tomography 

The straightforward method to get complete knowledge of a quantum operation $\mathcal{E}$ 
is to perform quantum process tomography. However, this requires quantum state tomography 
on $d^2$ states, which requires of order $d^4$ experiments. Extending an earlier 
observation for a single qubit by Bowdrey et al. [BOS+02], Nielsen [Nie02] showed how to 
estimate the average gate fidelity using quantum state tomography on fewer states. We will 
assume that we have a trace-preserving quantum operation $\mathcal{E}$. 

Single Qubit Case 

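[BOS+02] analytically evaluated the average gate fidelity for a single qubit. They 
described the average fidelity as a sum over four mixed states which seem to arise 
naturally in certain experimental setups: 

$$F_{avg}(U, \tilde{\mathcal{E}}) = \frac{1}{2} + \frac{1}{3}\sum_{j \in \{x,y,z\}} \mathrm{tr}\Big(U\frac{\sigma_j}{2}U^\dagger\,\tilde{\mathcal{E}}\Big(\frac{\sigma_j}{2}\Big)\Big) \tag{2.6}$$

They also gave a characterization using the six pure states corresponding to the axes of 
the Bloch sphere (see (A.3)), which we will denote by 
$\rho_{\pm x}, \rho_{\pm y}, \rho_{\pm z}$: 

$$F_{avg}(U, \tilde{\mathcal{E}}) = \frac{1}{6}\sum_{j \in \{\pm x, \pm y, \pm z\}} \mathrm{tr}\big(U\rho_j U^\dagger\,\tilde{\mathcal{E}}(\rho_j)\big) \tag{2.7}$$

In both cases the states $\tilde{\mathcal{E}}(\frac{\sigma_j}{2})$ and 
$\tilde{\mathcal{E}}(\rho_j)$ need to be determined experimentally using quantum state 
tomography. 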
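
For a single qubit, the six-state formula (2.7) can be compared with the Kraus-operator 
formula of Corollary 2.3.19. The following sketch uses a hypothetical example: a Hadamard 
gate $U$ whose implementation suffers from depolarizing noise that acts as a random Pauli 
gate with probability $p = 0.15$. The output states are computed from the Kraus operators 
instead of being determined by state tomography. 

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def average_fidelity_from_kraus(noise_kraus):
    """Corollary 2.3.19: (sum_k |tr E_k|^2 + d) / (d^2 + d) for the noise Kraus operators."""
    d = noise_kraus[0].shape[0]
    return (sum(abs(np.trace(E)) ** 2 for E in noise_kraus) + d) / (d * d + d)

def average_fidelity_six_states(U, implementation):
    """Equation (2.7): average of tr(U rho_j U^dagger E~(rho_j)) over the six axis states."""
    total = 0.0
    for P in (X, Y, Z):
        for sign in (+1, -1):
            rho = (I2 + sign * P) / 2               # pure state on the +/- axis of P
            total += np.real(np.trace(U @ rho @ U.conj().T @ implementation(rho)))
    return total / 6

p = 0.15                                            # hypothetical noise strength
U = (X + Z) / np.sqrt(2)                            # Hadamard gate
noise_kraus = [np.sqrt(1 - p) * I2] + [np.sqrt(p / 3) * P for P in (X, Y, Z)]
impl_kraus = [E @ U for E in noise_kraus]           # A_k = E_k U, so that E_k = A_k U^dagger

def implementation(rho):
    """The noisy implementation E~ of U in operator-sum form."""
    return sum(A @ rho @ A.conj().T for A in impl_kraus)

f_kraus = average_fidelity_from_kraus(noise_kraus)
f_six = average_fidelity_six_states(U, implementation)
print(f_kraus, f_six)
```

Both values equal $1 - \frac{2p}{3} = 0.9$, and the result is independent of the gate $U$, 
as Corollary 2.3.19 states. 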

General Case 

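[HHH99] showed a relationship between the so-called entanglement fidelity and the average 
gate fidelity. 

Definition 2.3.20. Let $\mathcal{E}$ be a quantum operation on a Hilbert space 
$\mathcal{H}$ of a system $R$. Now consider a second system $Q$ with the same state space. 
Let $\rho$ be a maximally entangled state on $RQ$. The entanglement fidelity of 
$\mathcal{E}$ is defined as 

$$F_e(\mathcal{E}) = \langle\rho, (\mathbb{1} \otimes \mathcal{E})(\rho)\rangle = \mathrm{tr}\big(\rho^\dagger(\mathbb{1} \otimes \mathcal{E})(\rho)\big).$$

The entanglement fidelity is a measure of how well entanglement is preserved by the 
operation $\mathcal{E}$. The definition is sound as all maximally entangled states are 
related by a unitary transformation on $R$ which does not change the value of 
$F_e(\mathcal{E})$. 

Theorem 2.3.21. Let $\tilde{\mathcal{E}}$ be a trace-preserving quantum operation, let $U$ 
be a unitary operator and let $\mathcal{E} = \tilde{\mathcal{E}}\hat{U}^\dagger$, i.e. 
$\mathcal{E}(\rho) = \tilde{\mathcal{E}}(U^\dagger\rho U)$, be the corresponding noise 
operation. The following relationship holds between the average gate fidelity and the 
entanglement fidelity: 

$$F_{avg}(U, \tilde{\mathcal{E}}) = \frac{d\,F_e(\mathcal{E}) + 1}{d + 1} \tag{2.8}$$

Proof. We will consider the case of a quantum channel first, i.e. $U = \mathbb{1}$. 
Furthermore, we consider a depolarizing channel $\mathcal{E}_D$ with channel parameter $p$. 
We can show (2.8) directly. 

$$\begin{aligned}
F_{avg}(\mathbb{1}, \mathcal{E}_D) &= \int_{F-S} \langle\psi|\mathcal{E}_D(|\psi\rangle\langle\psi|)|\psi\rangle\,d|\psi\rangle \\
&= \int_{F-S} \langle\psi|\Big(p|\psi\rangle\langle\psi| + (1-p)\frac{\mathbb{1}}{d}\Big)|\psi\rangle\,d|\psi\rangle \\
&= p + (1-p)\frac{1}{d}
\end{aligned} \tag{2.9}$$

Using the maximally entangled state 
$|\phi\rangle = \frac{1}{\sqrt{d}}\sum_{x=1}^{d} |x\rangle|x\rangle$, we can compute the 
entanglement fidelity of the depolarizing channel as follows. 

$$\begin{aligned}
F_e(\mathcal{E}_D) &= \frac{1}{d}\sum_{w,z=1}^{d} (\langle w| \otimes \langle w|)\,(\mathbb{1} \otimes \mathcal{E}_D)\Big(\frac{1}{d}\sum_{x,y=1}^{d} |x\rangle\langle y| \otimes |x\rangle\langle y|\Big)\,(|z\rangle \otimes |z\rangle) \\
&= \frac{1}{d^2}\sum_{x,y=1}^{d} (\langle x| \otimes \langle x|)\Big(|x\rangle\langle y| \otimes \Big(p|x\rangle\langle y| + (1-p)\delta_{xy}\frac{\mathbb{1}}{d}\Big)\Big)(|y\rangle \otimes |y\rangle) \\
&= p + (1-p)\frac{1}{d^2}
\end{aligned} \tag{2.10}$$

From the explicit formulas (2.9) and (2.10), 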

dF e (£ D ) + 1 



F. vg (£n,t) d + 1 



follows. 
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Using the technique of twirling from Definition 12.3. lfi| we can extend this proof from 
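depolarizing channels to general channels. From Lemma 2.3.17 it follows that $\mathcal{E}_T$ is unitarily invariant and Corollary 2.3.15 shows that $\mathcal{E}_T$ is a depolarizing channel. It remains to show that the average channel fidelity is invariant under twirling:

\[
\begin{aligned}
F_{avg}(\mathcal{E}_T, 1) &= \int_{FS} \langle\psi|\left(\int_{U(d)} U^\dagger\,\mathcal{E}(U|\psi\rangle\langle\psi|U^\dagger)\,U\, dU\right)|\psi\rangle\, d|\psi\rangle \\
&= \int_{U(d)}\int_{FS} \langle\psi|U^\dagger\,\mathcal{E}(U|\psi\rangle\langle\psi|U^\dagger)\,U|\psi\rangle\, d|\psi\rangle\, dU \\
&= \int_{U(d)}\int_{FS} \langle\psi'|\,\mathcal{E}(|\psi'\rangle\langle\psi'|)\,|\psi'\rangle\, d|\psi'\rangle\, dU \\
&= \int_{U(d)} F_{avg}(1, \mathcal{E})\, dU
\end{aligned}
\]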

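We have substituted $|\psi'\rangle = U|\psi\rangle$ and used the unitary invariance of the Fubini-Study measure. For the entanglement fidelity, we use the fact that $|\phi'\rangle = U|\phi\rangle$ is also a maximally entangled state for any unitary transformation $U$. Therefore

\[
\begin{aligned}
F_e(\mathcal{E}_T) &= \langle\phi| \int_{U(d)} U^\dagger\,\mathcal{E}(U|\phi\rangle\langle\phi|U^\dagger)\,U\, dU\, |\phi\rangle \\
&= \int_{U(d)} \langle\phi'|\,\mathcal{E}(|\phi'\rangle\langle\phi'|)\,|\phi'\rangle\, dU \\
&= \langle\phi|\,\mathcal{E}(|\phi\rangle\langle\phi|)\,|\phi\rangle = F_e(\mathcal{E}).
\end{aligned}
\]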
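Thus (2.8) holds for general channels as well. It is extended to the gate fidelity using

\[
\begin{aligned}
F_{avg}(U, \mathcal{E}) &= \int_{FS} \langle\psi|U^\dagger\,\mathcal{E}(|\psi\rangle\langle\psi|)\,U|\psi\rangle\, d|\psi\rangle \\
&= \int_{FS} \langle\psi'|U U^\dagger\,\mathcal{E}(U^\dagger|\psi'\rangle\langle\psi'|U)\,|\psi'\rangle\, d|\psi'\rangle \\
&= F_{avg}(1, \mathcal{E}U^\dagger)
\end{aligned}
\]

where we have substituted $|\psi\rangle = U^\dagger|\psi'\rangle$. □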
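The relationship between the two fidelities can be checked numerically for a single qubit. The following is a minimal sketch, assuming NumPy and an amplitude-damping channel given by Kraus operators; the channel, the parameter `GAMMA`, and all function names are illustrative choices and do not appear in the text. The entanglement fidelity is evaluated from Definition 2.3.20, and the average fidelity from the six-state average (2.7) with $U = 1$.

```python
import numpy as np

GAMMA = 0.3  # damping strength, chosen arbitrarily for this illustration
AMPLITUDE_DAMPING = [
    np.array([[1, 0], [0, np.sqrt(1 - GAMMA)]], dtype=complex),
    np.array([[0, np.sqrt(GAMMA)], [0, 0]], dtype=complex),
]


def apply_channel(kraus, rho):
    """Apply the channel rho -> sum_k K rho K^dagger."""
    return sum(K @ rho @ K.conj().T for K in kraus)


def entanglement_fidelity(kraus, d):
    """<phi|(1 (x) E)(|phi><phi|)|phi> for |phi> = d^{-1/2} sum_x |x>|x>."""
    phi = np.eye(d).reshape(d * d) / np.sqrt(d)
    rho = np.outer(phi, phi.conj())
    out = sum(
        np.kron(np.eye(d), K) @ rho @ np.kron(np.eye(d), K).conj().T
        for K in kraus
    )
    return float(np.real(phi.conj() @ out @ phi))


def average_fidelity_qubit(kraus):
    """Average of <psi|E(|psi><psi|)|psi> over the six Pauli eigenstates."""
    s = 1 / np.sqrt(2)
    states = [
        np.array([1, 0]), np.array([0, 1]),
        s * np.array([1, 1]), s * np.array([1, -1]),
        s * np.array([1, 1j]), s * np.array([1, -1j]),
    ]
    total = 0.0
    for psi in states:
        rho = np.outer(psi, psi.conj())
        total += np.real(psi.conj() @ apply_channel(kraus, rho) @ psi)
    return total / 6


f_e = entanglement_fidelity(AMPLITUDE_DAMPING, 2)
f_avg = average_fidelity_qubit(AMPLITUDE_DAMPING)
print(f_avg, (2 * f_e + 1) / 3)  # the two numbers should agree for d = 2
```

The six Pauli eigenstates are used instead of Monte Carlo sampling so that the comparison is deterministic.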

[Nie02] utilized this connection and showed how to calculate the entanglement fidelity experimentally using state tomography on $\mathcal{E}(\sigma^{(i,j)})$ for a set of $d^2$ states $\sigma^{(i,j)}$ that form an operator basis. This yields the entanglement fidelity using $O(d^4)$ experiments and classical post-processing of $d^2 \times d^2$ complex matrices. Using (2.8), we can subsequently compute the average fidelity.
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2.3.7 Fidelity Estimation using Random States 
Introduction 

Another approach to calculating the average fidelity is to sample over uniformly distributed random quantum states. Given the definition of the average fidelity in Definition 2.3.11, Emerson et al. [EAZ05] suggested a "motion-reversal" experiment shown in Figure 2.5. As shown in Corollary 2.3.19, the average fidelity does not depend on the actual algorithm $U$ in question. It only depends on the cumulative noise $\mathcal{E}$. Assuming that the noise introduced by reversing $U$ will not increase the fidelity, as the motion reversal does not cancel out noise, we will get a lower bound for the fidelity by implementing $U U^\dagger = 1$.

[Circuit diagram: $|0\rangle$, followed by $V$, $U$, $U^\dagger$, $V^\dagger$, and a measurement]

Figure 2.5: Circuit to Estimate the Average Fidelity using Random Unitaries $V \in_R U(d)$



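Theorem 2.3.22. Let $p$ denote the probability to measure zero at the end of the estimation circuit. Then

\[ p = F_{avg}(\mathcal{E}). \]

Proof. The theorem follows directly from the unitary invariance of the Fubini-Study measure. We see that

\[
\begin{aligned}
p &= \int_{U(d)} \langle 0|V^\dagger U U^\dagger\,\mathcal{E}(U^\dagger U V|0\rangle\langle 0|V^\dagger U^\dagger U)\,V|0\rangle\, dV \\
&= \int_{FS} \langle\psi|\,\mathcal{E}(|\psi\rangle\langle\psi|)\,|\psi\rangle\, d|\psi\rangle \\
&= F_{avg}(\mathcal{E}).
\end{aligned}
\]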

□ 
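The estimation procedure of Theorem 2.3.22 can be simulated classically for small dimensions. The sketch below makes several assumptions that are not part of the text: the cumulative noise is modelled as a depolarizing channel with parameter `p_channel`, the pair $U$, $U^\dagger$ is replaced by the identity followed by that noise, and Haar-distributed unitaries are sampled with the standard QR-based recipe. All function names are hypothetical.

```python
import numpy as np


def haar_unitary(d, rng):
    """Sample a Haar-distributed unitary via QR of a complex Gaussian matrix."""
    z = (rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    phases = np.diag(r) / np.abs(np.diag(r))
    return q * phases  # rescale column j by the phase of r[j, j]


def depolarize(rho, p):
    """Depolarizing channel: rho -> p rho + (1 - p) tr(rho) 1/d."""
    d = rho.shape[0]
    return p * rho + (1 - p) * np.trace(rho) * np.eye(d) / d


def estimate_p(d, p_channel, shots, rng):
    """Estimate the probability of measuring zero in the motion-reversal circuit."""
    hits = 0
    for _ in range(shots):
        v = haar_unitary(d, rng)
        psi = v[:, 0]  # V|0>
        rho = depolarize(np.outer(psi, psi.conj()), p_channel)
        prob_zero = np.real(psi.conj() @ rho @ psi)  # <0|V^dagger E(.) V|0>
        hits += rng.random() < prob_zero  # one simulated measurement outcome
    return hits / shots


rng = np.random.default_rng(0)
print(estimate_p(4, 0.9, 4000, rng))  # compare with p + (1 - p)/d from (2.9)
```

Each run of the loop corresponds to one execution of the circuit in Figure 2.5, so the statistical error of the estimate decreases with the square root of the number of shots.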

$p$ can be estimated up to an arbitrary precision using standard tools from statistics as seen in Section 2.3.1. Although this approach seems promising and would drastically reduce the amount of classical postprocessing that is needed for the other approaches, it requires the implementation of random circuits. It is known [NC00] for the case of $d = 2^N$ that the decomposition of most unitary operators $V \in U(d)$ requires

\[ \Omega\left(\frac{d \log(1/\epsilon)}{\log\log d}\right) \tag{2.11} \]

one- and two-qubit gates to approximate $V$ to within $\epsilon$ in the 2-norm for linear operators. Thus most unitary operators cannot be efficiently realized on a quantum computer. Therefore, uniformly distributed random unitaries are generally not feasible in practice.

[ELL05] presents an approach to efficiently approximate Haar-distributed unitaries. The idea is to start with a probability distribution $f$ over a subset $S \subset U(d)$ that either generates the full set $U(d)$ or a dense subset. In the first case, $S$ will be continuously parametrized, whereas a discrete gate set $S$ will be sufficient in the second case. The key idea is to choose a gate $V_i \in S$ according to the distribution $f$ for each step $i = 1, 2, \ldots, m$. Then the resulting probability distribution over $V = \prod_{i=1}^{m} V_i$ will converge to the Haar measure either uniformly or in the weak topology according to a given test function. See Sections A.8 and A.9 for an introduction to Fourier analysis on compact groups. For the remainder of this section, let $G$ denote the compact topological group $U(d)$ with elements $g \in G$.


Exponential Uniform Convergence to the Haar Measure 

Let $\mu_f \in M(G)$ be an absolutely continuous probability measure on $G$ over a subset $S \subset G$ that generates $G$. This enables us to consider $\mu_f$ both as a measure and as a function $f \in L^1(G)$. We will further restrict ourselves to nice positive-definite $f$ (see Definition A.9.25), such that we do not need to worry about the pointwise convergence of its Fourier series. For convenience, we will consider the function $f$ for the remainder of this section, where $f$ is the probability distribution of a single element $g \in G$. If we pick two elements $g_1, g_2 \in G$ independently according to $f$, the distribution of $g = g_1 g_2$ is given by the convolution of $f$ with itself. Thus, the distribution over random circuits that consist of two gates which were chosen independently according to $f$ is given by $f * f$. We will repeat this process $m$ times and have that the group elements $g = g_1 g_2 \cdots g_m \in G$ are distributed according to

\[ f^{*m} = \underbrace{f * f * \cdots * f}_{m \text{ times}}. \]

In order to show uniform convergence of $f^{*m}$ to the Haar measure on $G$, we need two technical lemmas.

Lemma 2.3.23. For the Haar measure, we have the Fourier coefficients

\[ \hat{m}(D^s) = \int_G D^s(g)\, dg = \begin{cases} 1 & s = 0 \\ 0 & s \geq 1. \end{cases} \]

Proof. Using the unitary invariance of the Haar measure, we observe that

\[ \hat{m}(D^s) = \int_G D^s(g)\, dg = \int_G D^s(hg)\, dg = \int_G D^s(h) D^s(g)\, dg = D^s(h) \int_G D^s(g)\, dg \]

for arbitrary $h \in G$. It follows that $D^s(h)\,\hat{m}(D^s) = \hat{m}(D^s)$ for all $h \in G$, which implies $D^s(h) = 1$ for all $h \in G$ or $\hat{m}(D^s) = 0$. This implies $\hat{m}(D^0) = 1$ and $\hat{m}(D^s) = 0$ for $s \geq 1$, as $D^s(h) \neq 1$ provided $h \neq e \in G$. □



Lemma 2.3.24.

\[ \|\hat{f}(D^s)\| < 1 \text{ for } s \geq 1, \qquad \hat{f}(D^0) = \|\hat{f}(D^0)\| = 1. \]

Proof. The case $s = 0$ follows immediately from

\[ \hat{f}(D^0) = \int_G f(g)\, dg = 1 \]

as $f$ is a probability measure. For the case $s \geq 1$, let $x \in \mathcal{H}_{d_s}$ be a vector in the representation Hilbert space for the $s$-th representation. Now

\[
\begin{aligned}
\left\| \int_G f(g) D^s(g)\, dg\; x \right\|_2 &\leq \int_G \| f(g) D^s(g) x \|_2\, dg \\
&= \int_G f(g)\, \| D^s(g) x \|_2\, dg \\
&= \int_G f(g)\, \| x \|_2\, dg \\
&= \int_G f(g)\, dg\; \| x \|_2 = \| x \|_2
\end{aligned}
\]

where we used that $f$ is a probability measure and that $D^s$ is a unitary representation of $G$. To show that $\|\hat{f}(D^s)\| < 1$, we assume

\[ \left\| \int_G f(g) D^s(g)\, dg\; x \right\|_2 = \int_G \| f(g) D^s(g) x \|_2\, dg. \]

It follows from the triangle inequality of the norm and the linearity of the integral that there is a vector $y \in \mathcal{H}_{d_s}$ such that for all $g \in G$: $D^s(g) x = \xi(g)\, y$ for $\xi \in C(G)$ a continuous bounded complex-valued function on $G$. This implies that $\xi$ is a one-dimensional representation embedded in the irreducible representation $D^s$ of dimension $d_s > 1$. This is a contradiction, and the lemma follows. □

First we note that $f^{*m}$ converges uniformly to the Haar measure over $G$ if $f^{*m}$ converges uniformly to the constant function $1 \in L^1(G)$. However, we do not just consider convergence with respect to the $L^1$-norm, but rather pointwise uniform convergence. To analyse the convergence, we will consider the Fourier transform of $f^{*m}$, as the convolution of two functions $\phi, \psi \in L^1(G)$ turns into a simple product in its Fourier representations

\[ \widehat{\phi * \psi}(D^s) = \hat{\phi}(D^s)\, \hat{\psi}(D^s). \]

We have

\[ \widehat{f^{*m}}(D^s) = \hat{f}(D^s)^m \]

which is an $m$-fold product of $d_s \times d_s$ complex matrices. Lemmas 2.3.24 and 2.3.23 already show that the Fourier coefficients of $f^{*m}$ converge to the Fourier coefficients of the Haar measure as $m$ approaches infinity. However, this does not prove uniform convergence yet. We need to show that the Fourier approach is valid and that we have uniform convergence indeed.
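The decay of a single Fourier coefficient under repeated convolution can be illustrated numerically. The sketch below is only an illustration under assumptions that go beyond the text: it uses a discrete distribution on the two single-qubit gates $H$ and $T$ (so it falls outside the absolutely continuous setting of this section), it looks only at the defining representation of $U(2)$, and the helper names are hypothetical. For a discrete measure the coefficient $\hat{f}(D)$ is simply the weighted average of the gate matrices, and the coefficient of the $m$-fold convolution is its $m$-th matrix power.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
T = np.diag([1, np.exp(1j * np.pi / 4)])


def fourier_coefficient(gates, weights):
    """Coefficient of a discrete measure in the defining representation."""
    return sum(w * g for g, w in zip(gates, weights))


def coefficient_after(gates, weights, m):
    """Coefficient of the m-fold convolution: the m-th matrix power."""
    return np.linalg.matrix_power(fourier_coefficient(gates, weights), m)


steps = (1, 5, 20, 80)
norms = [np.linalg.norm(coefficient_after([H, T], [0.5, 0.5], m), 2) for m in steps]
for m, n in zip(steps, norms):
    print(m, n)  # spectral norms shrink as m grows
```

The spectral norm of the single-step coefficient is strictly below one here, so submultiplicativity of the norm forces the decay; the rate is slow because the norm is close to one.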

Theorem 2.3.25. The probability measure $f^{*m}$ converges uniformly to the Haar measure over $G$.

Proof. We note that $f^{*m}$ is nice positive-definite if $f$ is. Therefore for any $m$, the Fourier series of $f^{*m}$ converges pointwise [Edw72], where the limit is taken over finite subsets $P \subset \hat{G}$ of irreducible representations of $G$.

Uniform convergence is understood in the $L^\infty$ norm, meaning that for almost all $g \in G$,

\[ \lim_{m\to\infty} f^{*m}(g) = 1 \]

where the limit is uniform, i.e. we want that for any $\epsilon > 0$ there is a number of convolutions $M$ such that for almost all $g \in G$ and all $m > M$:

\[ |f^{*m}(g) - 1| < \epsilon. \]

In other words, we want that

\[ \lim_{m\to\infty} \|f^{*m} - 1\|_\infty = 0. \]
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In order to get this uniform convergence, we will consider the convergence of the Fourier 
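coefficients of $f^{*m}$ and show that we can restrict ourselves to a finite number of Fourier coefficients $\widehat{f^{*m}}(D^s)$. The Peter-Weyl approximation from Fact A.9.27 guarantees that for any real $\epsilon > 0$ there is an $N_\epsilon$ such that for almost all $g \in G$,

\[ \left| f(g) - \sum_{s=0}^{N_\epsilon} d_s \operatorname{tr} \hat{f}(D^s) D^s(g)^\dagger \right| < \epsilon. \]

This "representation cut-off" [ELL05] enables us to consider the truncated function

\[ f_{N_\epsilon}(g) = \sum_{s \leq N_\epsilon} d_s \operatorname{tr} \hat{f}(D^s) D^s(g)^\dagger \]

for further analysis. For almost all $g \in G$ and $m \geq 2$, we have from the triangle inequality and the Peter-Weyl approximation that

\[
\begin{aligned}
|f^{*m}(g) - 1| &\leq |f^{*m}(g) - f_{N_\epsilon}^{*m}(g)| + |f_{N_\epsilon}^{*m}(g) - 1| \\
&= |f^{*m}(g) - f_{N_\epsilon}^{*m}(g)| + \left| \sum_{s=0}^{N_\epsilon} d_s \operatorname{tr} \hat{f}^m(D^s) D^s(g)^\dagger - 1 \right| \\
&= |f^{*m}(g) - f_{N_\epsilon}^{*m}(g)| + \left| \sum_{s=1}^{N_\epsilon} d_s \operatorname{tr} \hat{f}^m(D^s) D^s(g)^\dagger \right|
\end{aligned}
\tag{2.12}
\]

where the last line follows from the case $s = 0$ in Lemma 2.3.24.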

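To bound the first term we claim that $f_{N_\epsilon} * f_{N_\epsilon} = f * f_{N_\epsilon}$ almost everywhere. Using the Uniqueness Theorem (see Fact A.9.18), it suffices to show that their Fourier coefficients are equal. Let $D^s \in \hat{G}$ and observe

\[ \widehat{f_{N_\epsilon} * f_{N_\epsilon}}(D^s) = \hat{f}_{N_\epsilon}(D^s)^2 = \begin{cases} \hat{f}(D^s)^2 & s \leq N_\epsilon \\ 0 & s > N_\epsilon \end{cases} \]

and

\[ \widehat{f * f_{N_\epsilon}}(D^s) = \hat{f}(D^s)\, \hat{f}_{N_\epsilon}(D^s) = \begin{cases} \hat{f}(D^s)^2 & s \leq N_\epsilon \\ 0 & s > N_\epsilon \end{cases} \]

and the claim follows.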
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Therefore for almost all g E G, 

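\[
\begin{aligned}
|f^{*m}(g) - f_{N_\epsilon}^{*m}(g)| &= \left| f * \left( f^{*(m-1)} - f_{N_\epsilon}^{*(m-1)} \right)(g) \right| \\
&= \left| \int_G f(h) \left( f^{*(m-1)} - f_{N_\epsilon}^{*(m-1)} \right)(g^{-1}h)\, dh \right| \\
&\leq \int_G f(h) \left| \left( f^{*(m-1)} - f_{N_\epsilon}^{*(m-1)} \right)(g^{-1}h) \right| dh \\
&\leq \left\| f^{*(m-1)} - f_{N_\epsilon}^{*(m-1)} \right\|_\infty
\end{aligned}
\]

where we used that $f$ is a positive function with integral 1 and that the convolution is associative. It follows by induction that

\[ \|f^{*m} - f_{N_\epsilon}^{*m}\|_\infty \leq \|f - f_{N_\epsilon}\|_\infty < \epsilon. \]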

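The second term in (2.12) can be bounded using the notation $[D^s(g)^\dagger]_j$ to denote the single-column matrix that consists of the complex-conjugated and transposed $j$-th row of $D^s(g)$. We have

\[
\begin{aligned}
\left| \operatorname{tr} \hat{f}^m(D^s) D^s(g)^\dagger \right| &= \left| \sum_{j=1}^{d_s} \left( \hat{f}(D^s)^m \, [D^s(g)^\dagger]_j \right)_j \right| \\
&\leq \sum_{j=1}^{d_s} \left\| \hat{f}(D^s)^m \, [D^s(g)^\dagger]_j \right\|_2 \\
&\leq d_s \, \|\hat{f}(D^s)\|^m
\end{aligned}
\tag{2.13}
\]

where we used the definition of the matrix norm, the unitarity of $D^s(g)$, and that $|x_i| \leq \|x\|_2$ for any column vector $x = (x_j)_{j=1}^{d}$ and any $i = 1, 2, \ldots, d$. It follows with the triangle inequality and the definition of

\[ \alpha_\epsilon = \max_{1 \leq s \leq N_\epsilon} \|\hat{f}(D^s)\| \]

that

\[
\begin{aligned}
\left| \sum_{s=1}^{N_\epsilon} d_s \operatorname{tr} \hat{f}^m(D^s) D^s(g)^\dagger \right| &\leq \sum_{s=1}^{N_\epsilon} d_s \left| \operatorname{tr} \hat{f}^m(D^s) D^s(g)^\dagger \right| \\
&\leq \sum_{s=1}^{N_\epsilon} d_s^2 \, \|\hat{f}(D^s)\|^m \\
&\leq \alpha_\epsilon^m \sum_{s=1}^{N_\epsilon} d_s^2.
\end{aligned}
\]

Now we can choose a constant $M_\epsilon$ that will only depend on $\epsilon$ such that for all $m > M_\epsilon$,

\[ \alpha_\epsilon^m \sum_{s=1}^{N_\epsilon} d_s^2 < \epsilon. \]

Putting the bounds for both terms in (2.12) together, we arrive at

\[ \|f^{*m} - 1\|_\infty < 2\epsilon, \]

thus proving uniform convergence. □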

We can improve the previous theorem to give the explicit convergence rate for nice positive-definite probability measures $f \in L^2(G)$.

Theorem 2.3.26. For a fixed dimension $d$ and additionally $f \in L^2(G)$, $f$ nice positive-definite, $f^{*m}$ converges exponentially to the constant function 1.

Proof. From the Parseval formula (see Fact A.9.19) and the fact that the induced matrix norm from $\mathcal{H}_{d_s}$ is the Frobenius norm $\|A\| = \sqrt{\operatorname{tr} A A^\dagger}$ we conclude

\\f{D s )\\ < ^trf(D s )f(D s V 

< -±=y/d a tlf(D')f(D'y 
Vd s v 

1 „ .„ 1 



J2d s trf^(D s )D s ( g y 



8=1 
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as / is a probability measure with integral 1. 

Hence we have that for any e > there is a "representation cut-off" S e such that for 
all s > S e 

\\f(D s )\\<^. 



As $f^{*m}$ is a nice positive-definite function whenever $f$ is nice positive-definite, we can use the uniform convergence of its Fourier series. Define

\[ \alpha_\epsilon = \max_{1 \leq s \leq S_\epsilon} \|\hat{f}(D^s)\| \tag{2.14} \]

and use (2.13) to bound

\P m (g)-i\ = J2 d ° tr f( D TD s (g)ï 

S>1 



S>1 

s=l s>S t 
s=l S>S e 

for all $g \in G$. Using known formulas for the dimensions $d_s$ of the irreducible representations of $U(d)$ from [VK91, VK98] (see Fact A.8.12), Emerson et al. [ELL05] showed that

\[ \sum_{s > S_\epsilon} d_s^{-(m/2 - 2)} \]

converges as long as $m > 6$. Hence exponential convergence of $\|f^{*m} - 1\|_\infty$ follows. □

This proves that by choosing an arbitrary single-qubit or two-qubit gate according to the initial distribution $f$ in each step, the circuit comprised of the composition of these gates will converge to a random circuit at a rate exponential in the number of steps. However, it is not clear how the convergence rate (2.14) depends on the dimension $d = 2^N$ of the $N$-qubit system Hilbert space. In order for the random circuit construction to be efficient, we need

1 



a e = l-0 



poly(N, \] 
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It is not clear whether this can be achieved, as no reasonable bounds on the norm ||/(.D S )|| 
could be established so far. 

Provided an efficient pseudorandom distribution of circuits V with \\f* m — 1 1| oo < e 
exists, the average fidelity 

JU(d) 

could be estimated using the circuit shown in Figure 2.5 within a precision of $\epsilon$.



Weak Convergence to the Haar Measure 

For many practical applications, pseudorandom unitaries need not be drawn from a measure that converges uniformly to the Haar measure. Also, if $\mu_f$ is not an absolutely continuous probability measure that gives rise to a nice positive-definite function $f \in L^1(G)$, uniform convergence could not be shown so far. This is the case if the initial probability measure does not have support over a continuously parametrized gate set $S \subset U(d)$, but rather over a discrete set. Then $\mu_f$ will be a weighted sum of $\delta$-functions over the elements $g \in S$.

In that case, the random unitary approach can still give convergence, but in a weaker sense. Specifically, convergence to the Haar measure can be guaranteed with respect to certain test functions $\phi(g)$ in the weak topology:

\[ \lim_{m\to\infty} \int_G \phi(g)\, d\mu_f^{*m} = \int_G \phi(g)\, dg. \]

The most accessible test functions are trigonometric polynomials, which are functions $\phi$ such that $\operatorname{supp} \hat{\phi} = \{ s \in \hat{G} \mid \hat{\phi}(D^s) \neq 0 \}$ is finite.

Using the orthogonality relations (see Fact A.9.17), it follows that we need only consider those irreducible representations $D^s$ for which $\hat{\phi}(D^s) \neq 0$. However, it remains an open problem to actually calculate the convergence rate and to pick a suitable initial distribution $f$ in this setting.



2.3.8 Alternative Approach using Many Additional Qubits 

If additional qubits can be added to the system, there is an easy way to determine the entanglement fidelity of a quantum operation $\mathcal{E}$ [BDSW96]. Using the approach described in Section 2.3.6, a motion-reversal experiment can be used to determine the average fidelity of $\mathcal{E}$ by implementing $U^\dagger U = 1$ and assuming that the noise will not cancel out because it is non-unitary. However, it is conceivable that this process might reduce the average fidelity as two operations have to be implemented. Nonetheless, this will provide a lower bound, at least.

Figure 2.6: Circuit to Estimate the Average Fidelity using N ancillas.

Consider the circuit in Figure 2.6. The first part creates the maximally entangled state

\[ |\phi\rangle = \frac{1}{\sqrt{2^N}} \sum_{x=0}^{2^N - 1} |x\rangle|x\rangle \]

and the third part of the circuit is the inverse of that computation. Thus measuring the final state in the computational basis enables us to measure the entanglement fidelity. To see this, denote by $p$ the probability to measure $|0^{\otimes 2N}\rangle$ at the end of the computation and see that

\[ p = \langle\phi|\,(1 \otimes \mathcal{E})(|\phi\rangle\langle\phi|)\,|\phi\rangle = F_e(\mathcal{E}). \]

$p$ is the probability of a binary random variable and can thus be estimated to an arbitrary precision using the techniques outlined in Section 2.3.1. Using (2.8) and an estimate for $p$, we can calculate the average fidelity $F_{avg}(\mathcal{E}, 1) = F_{avg}(\mathcal{E}, U)$.



2.3.9 Discussion 

Both process tomography and Nielsen's approach using the entanglement fidelity work without any additional qubits, but require $O(d^4)$ experiments. This is exponential in the number of qubits $N = \log d$, hence these methods are deemed inefficient. Furthermore, they require classical processing of either $d^4 \times d^4$ or $d^2 \times d^2$ complex matrices, which is also inefficient.

The random circuit approach seems promising as it does not require additional qubits and only relies on standard statistical techniques to estimate the probability $p$ of observing zero at the end of the experiment. However, the convergence rate of the pseudorandom circuit construction as a function of the Hilbert space dimension $d$ is not clear yet. It is a promising technique and further work should investigate the convergence condition for a test function like the average gate fidelity.

The last approach estimates the average fidelity using $N$ additional qubits and requires classical postprocessing similar to the random circuit case. Provided additional qubits do not introduce too much additional noise and are experimentally feasible, this is the preferred construction. However, in many practical settings, the number of qubits is still strictly limited and each additional qubit is quite expensive [NC00]. It seems to be necessary to actually gain information about the structure of the noise before additional qubits can be realised. Also, both the random circuit and the last approach assume that the fidelity of implementing the motion-reversal experiment $U^\dagger U$ does not differ significantly from the average fidelity of an implementation of $U$.



Chapter 3 

Mutually-Unbiased Bases 



In this chapter, we will formally introduce the concept of mutually-unbiased bases and 
present the easiest constructions of these bases known so far. We will show that there are 
interesting open problems and nice applications beyond the context of this thesis. 



3.1 Introduction 

Two orthonormal bases $B_1$ and $B_2$ of a Hilbert space $\mathcal{H}$ of dimension $d$ are called mutually unbiased if for all $|\psi\rangle \in B_1$ and $|\phi\rangle \in B_2$

\[ |\langle\psi|\phi\rangle| = \frac{1}{\sqrt{d}}. \tag{3.1} \]

In a 1960 paper, Schwinger [Sch60] realized that if a state $|\psi\rangle$ is prepared as a basis state of $B_1$ and measured with respect to the basis $B_2$, it is just an equally weighted superposition over all basis states of $B_2$ and vice versa. Hence no information can be gained about a state that is created as a basis state of either $B_1$ or $B_2$ with the choice of basis unknown. This idea also underlies the famous BB84 quantum key distribution protocol [BB84].

For a single qubit system, the three bases

\[
\begin{aligned}
B_1 &= \{ |0\rangle, |1\rangle \}, \\
B_2 &= \left\{ |+\rangle = \frac{|0\rangle + |1\rangle}{\sqrt{2}},\; |-\rangle = \frac{|0\rangle - |1\rangle}{\sqrt{2}} \right\}, \text{ and} \\
B_3 &= \left\{ \frac{|0\rangle + i|1\rangle}{\sqrt{2}},\; \frac{|0\rangle - i|1\rangle}{\sqrt{2}} \right\}
\end{aligned}
\tag{3.2}
\]

form a set of pairwise mutually-unbiased bases or just "mutually-unbiased bases" for short. Sometimes, this is abbreviated by "MUB". The absolute value of the inner product between two vectors from different bases is $\frac{1}{\sqrt{2}}$, which corresponds to an angle of $\frac{\pi}{4}$. On the Bloch sphere (see Section A.3) the angles double and hence the vectors are orthogonal in the geometry of the three-dimensional Euclidean space. Figure 3.1 shows the layout of $B_1$, $B_2$, and $B_3$ on the Bloch sphere.
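Condition (3.1) is easy to check numerically for the three qubit bases. The following is a minimal sketch, assuming NumPy; the function name `are_mutually_unbiased` and the convention of storing basis vectors as matrix columns are choices made here for illustration.

```python
import itertools

import numpy as np

s = 1 / np.sqrt(2)
# Each matrix holds one basis; the columns are the basis vectors.
QUBIT_BASES = [
    np.array([[1, 0], [0, 1]], dtype=complex),          # computational basis
    s * np.array([[1, 1], [1, -1]], dtype=complex),      # |+>, |->
    s * np.array([[1, 1], [1j, -1j]], dtype=complex),    # (|0> +/- i|1>)/sqrt(2)
]


def are_mutually_unbiased(b1, b2, tol=1e-12):
    """Check |<psi|phi>| = 1/sqrt(d) for all psi in b1 and phi in b2."""
    d = b1.shape[0]
    overlaps = np.abs(b1.conj().T @ b2)
    return bool(np.allclose(overlaps, 1 / np.sqrt(d), atol=tol))


for i, j in itertools.combinations(range(3), 2):
    print(i, j, are_mutually_unbiased(QUBIT_BASES[i], QUBIT_BASES[j]))
```

The matrix `b1.conj().T @ b2` collects all pairwise inner products at once, so a single comparison covers every pair of vectors.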



From the Bloch sphere it is apparent that we cannot find a fourth basis that is mutually unbiased to $B_1$, $B_2$, and $B_3$. It was already suggested by [Iva81] that there can be at most $d + 1$ mutually-unbiased bases in a Hilbert space of dimension $d$.



Figure 3.1: Mutually-Unbiased Bases on the Bloch Sphere

3.2 History and Applications

The concept of mutually-unbiased bases seems to have emerged in 1960 in a work by Schwinger [Sch60, KR04, KR05a, Iva81, WF89]. Schwinger considered the problem of determining an unknown, possibly mixed state $\rho$ provided sufficiently many copies of $\rho$ are given. He introduced the term "complementarity" between two measurement operators. Given a system prepared in a basis state of a basis $B_1$, a measurement with respect to a mutually-unbiased basis $B_2$ gives no information about the state but just an equal distribution over all states in $B_2$. Although this fact had been known long before Schwinger [Sch60], he showed that the measurement operators corresponding to measurements in $d + 1$ mutually-unbiased bases form an operator basis and he called such measurement operators "maximally non-commutative". It was not until 20 years later that Ivanović [Iva81] explicitly showed how these measurements can be used to completely determine the unknown state $\rho$, thereby introducing the term "mutually 'orthogonal'" operators. Wootters and Fields [WF89] seem to have coined the term "mutually-unbiased bases". They also showed that there are at most $d + 1$ mutually-unbiased bases in a Hilbert space of dimension $d$ and gave the first explicit construction for such a complete set in case of prime power dimensions $d = p^k$ for $p \geq 2$.

The applications of mutually-unbiased bases are diverse. Firstly, the obvious application was quantum state determination [Iva81], where measurements with respect to mutually-unbiased bases are sufficient for quantum state tomography (see Section 2.3.1). Then, they have an application in quantum key distribution because of their nice information-theoretical property that a closely localized state in a basis $B_1$ looks like an equal superposition in a basis $B_2$ that is unbiased with respect to $B_1$. The BB84 protocol [BB84] made use of the fact that

\[
|+\rangle = \frac{|0\rangle + |1\rangle}{\sqrt{2}}, \quad
|-\rangle = \frac{|0\rangle - |1\rangle}{\sqrt{2}}, \quad
|0\rangle = \frac{|+\rangle + |-\rangle}{\sqrt{2}}, \text{ and} \quad
|1\rangle = \frac{|+\rangle - |-\rangle}{\sqrt{2}}
\]

and hence an eavesdropper cannot obtain any information about a state prepared as a basis state of either $B_1$ or $B_2$ if the choice of basis is unknown to them. This was also generalized to $d$-dimensional systems [CBK+02]. Buhrman et al. [BCH+05] have recently shown how mutually-unbiased bases can be used to implement a quantum string commitment protocol.

There is also an interesting application to the so-called Mean King's Problem, which amounts to determining the outcome of a measurement chosen randomly from a set of complementary observables. See [KR05b] for an overview of the state-of-the-art of the problem and how mutually-unbiased bases play a role. Besides showing another application of mutually-unbiased bases, this article is fun to read.

This thesis will use mutually-unbiased bases to estimate averages over the uniform measure of all pure states of a quantum system. We will present a result similar to [KR05a], where it was shown that mutually-unbiased bases can be used to estimate certain Fubini-Study averages. For that, we will give an alternative proof.



3.3 Constructions 

The first explicit construction by Wootters and Fields [WF89] was simplified and extended by subsequent works [BBRV02, Cha02, LBZ02, KR04, Dur05, PR05, RBKSS05]. Although significant simplifications were achieved, constructions for the cases of odd and even prime power dimensions still differ. Ref. [KR05a] gives a brief overview of most currently known constructions.



3.3.1 Odd Prime Power Dimension 



Let $\mathcal{H}$ be a Hilbert space of dimension $d = p^k$, $p$ an odd prime and $k \in \mathbb{N}$. Denote the computational basis by $\{ |x\rangle \mid x \in GF(p^k) \}$ assuming an arbitrary ordering of the elements of $GF(p^k)$. The following lemma will be crucial in the construction of an extremal set of mutually-unbiased bases for $\mathcal{H}$.



Lemma 3.3.1. Let $p > 2$ and let $\chi$ be a non-trivial additive character of $GF(p^k)$. Let

\[ p(X) = a_2 X^2 + a_1 X + a_0 \in GF(p^k)[X], \quad a_2 \neq 0, \]

be a polynomial of degree 2. Then

\[ \left| \sum_{x \in GF(p^k)} \chi(p(x)) \right| = \sqrt{p^k}. \]

Proof. See [LN94, Ch. 5] for a proof. □

Alltop [All80] constructed sequences of complex numbers that exhibit very low correlations for use in spread spectrum radar and communication applications. It was not until recently that Alltop's work was rediscovered and found to give a construction for a set of $p + 1$ mutually-unbiased bases in prime dimension $p$, $p \geq 5$ [KR04]. Ref. [KR04] also gave the generalization of Alltop's construction to the case of prime power dimensions.

This construction was improved by Klappenecker and Rötteler [KR04] to work for any odd prime power dimension $p^k$. It is based on Ivanović's work for prime dimensions [Iva81] that was later generalized by Wootters and Fields [WF89]. Different versions of the proof were given by Chaturvedi [Cha02] and Bandyopadhyay et al. [BBRV02]. We will present the proof by Klappenecker and Rötteler as it is the shortest one known to the author.
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Theorem 3.3.2. Let p > 3. Then the sets 

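\[ B_a = \{ |\psi_b^a\rangle \mid b \in GF(p^k) \}, \quad a \in GF(p^k), \]

where

\[ |\psi_b^a\rangle = \frac{1}{\sqrt{p^k}} \sum_{x \in GF(p^k)} \omega_p^{\operatorname{tr}(a x^2 + b x)} |x\rangle \tag{3.3} \]

together with the computational basis are a complete set of $d + 1$ mutually-unbiased bases.

Proof. Again, we consider the inner product

\[ |\langle\psi_b^a|\psi_{b'}^{a'}\rangle| = \frac{1}{p^k} \left| \sum_{x \in GF(p^k)} \omega_p^{\operatorname{tr}((a' - a)x^2 + (b' - b)x)} \right|. \tag{3.4} \]

In the case of vectors from the same basis, we take $a = a'$ and thus

\[ |\langle\psi_b^a|\psi_{b'}^{a}\rangle| = \frac{1}{p^k} \left| \sum_{x \in GF(p^k)} \omega_p^{\operatorname{tr}((b' - b)x)} \right| = \begin{cases} 1 & b = b' \\ 0 & b \neq b'. \end{cases} \]

Now assume $a \neq a'$. Lemma 3.3.1 implies that

\[ |\langle\psi_b^a|\psi_{b'}^{a'}\rangle| = \frac{1}{\sqrt{p^k}} \]

and hence $B_a$ and $B_{a'}$ are mutually unbiased. As the coefficients of the basis states of the computational basis in (3.3) have absolute value $\frac{1}{\sqrt{p^k}}$, $B_a$ is mutually unbiased to the computational basis for any $a$. □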

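For a qutrit system with dimension $d = 3$, a complete set of mutually-unbiased bases is now given by

\[
\begin{aligned}
B_0 &= \frac{1}{\sqrt{3}} \{ (1, 1, 1),\; (1, \omega_3, \omega_3^2),\; (1, \omega_3^2, \omega_3) \}, \\
B_1 &= \frac{1}{\sqrt{3}} \{ (1, \omega_3, \omega_3),\; (1, \omega_3^2, 1),\; (1, 1, \omega_3^2) \}, \\
B_2 &= \frac{1}{\sqrt{3}} \{ (1, \omega_3^2, \omega_3^2),\; (1, \omega_3, 1),\; (1, 1, \omega_3) \}, \text{ and}
\end{aligned}
\]

the computational basis, where we represented the states as column vectors with respect to the computational basis.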
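For prime dimension ($k = 1$) the trace in (3.3) is the identity, so the construction can be written down directly. The following sketch assumes NumPy and $\omega_p = e^{2\pi i/p}$; the function names are hypothetical. It also shows that the same formula fails for $p = 2$, where two of the generated bases coincide.

```python
import numpy as np


def prime_mubs(p):
    """Bases from (3.3) for prime p with k = 1; columns are basis vectors."""
    omega = np.exp(2j * np.pi / p)
    x = np.arange(p)
    bases = [np.eye(p, dtype=complex)]  # computational basis
    for a in range(p):
        # entry (x, b) of the exponent table is a*x^2 + b*x mod p
        exponents = (a * x[:, None] ** 2 + x[:, None] * x[None, :]) % p
        bases.append(omega ** exponents / np.sqrt(p))
    return bases


def is_mub_set(bases, tol=1e-9):
    """Check orthonormality of each basis and pairwise unbiasedness."""
    d = bases[0].shape[0]
    for i, bi in enumerate(bases):
        if not np.allclose(bi.conj().T @ bi, np.eye(d), atol=tol):
            return False
        for bj in bases[i + 1:]:
            if not np.allclose(np.abs(bi.conj().T @ bj), 1 / np.sqrt(d), atol=tol):
                return False
    return True


for p in (2, 3, 5, 7):
    print(p, is_mub_set(prime_mubs(p)))
```

For $p = 3$ the columns of the generated matrices reproduce the qutrit vectors listed above, up to the ordering of the vectors within each basis.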






3.3.2 Qubits 

In the case of $n$ qubits, the dimension of the state space $\mathcal{H}$ is $d = 2^n$, which is an even prime power. The construction in Theorem 3.3.2 breaks down in fields of characteristic 2, which is the case for $GF(2^n)$. Specifically, Lemma 3.3.1 does not hold in fields of characteristic 2.

Klappenecker and Rötteler [KR04] came up with a solution for the qubit case by considering finite rings instead of finite fields. In particular, they employed a lemma analogous to Lemma 3.3.1 that holds in Galois rings (see Section A.10.2). Let $GR(4^n)$ denote the Galois ring with $4^n$ elements and let $T_n$ be its Teichmüller set (see Definition A.10.29). We assume an arbitrary ordering of the elements of $T_n$ so that we can identify the elements of the computational basis with the elements of $T_n$.

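Lemma 3.3.3. The exponential sum $\Gamma : GR(4^n) \to \mathbb{C}$,

\[ \Gamma(x) = \sum_{y \in T_n} \omega_4^{\operatorname{tr}(xy)}, \]

evaluates to

\[ |\Gamma(x)| = \begin{cases} 0 & \text{if } x \in 2T_n, x \neq 0 \\ 2^n & \text{if } x = 0 \\ \sqrt{2^n} & \text{otherwise.} \end{cases} \]

Proof. See [Car98, Lemma 3] for a proof. □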

Using this lemma, the construction of a maximal set of mutually-unbiased bases in an 
n qubit system is very simple and elegant. 

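Theorem 3.3.4. Let $T_n$ be the Teichmüller set of $GR(4^n)$. Then the sets

\[ B_a = \{ |\psi_b^a\rangle \mid b \in T_n \}, \quad a \in T_n, \]

where

\[ |\psi_b^a\rangle = \frac{1}{\sqrt{2^n}} \sum_{k \in T_n} \omega_4^{\operatorname{tr}((a + 2b)k)} |k\rangle \tag{3.5} \]

together with the computational basis form a complete set of $2^n + 1$ mutually-unbiased bases.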
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Proof. The inner product between two vectors from bases B a and B a i evaluates to 

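\[ |\langle\psi_b^a|\psi_{b'}^{a'}\rangle| = \frac{1}{2^n} \left| \sum_{x \in T_n} \omega_4^{\operatorname{tr}(((a' - a) + 2(b' - b))x)} \right|. \tag{3.6} \]

For states from the same basis, $a = a'$, (3.6) simplifies to

\[ |\langle\psi_b^a|\psi_{b'}^{a}\rangle| = \frac{1}{2^n} \left| \sum_{x \in T_n} \omega_4^{\operatorname{tr}(2(b' - b)x)} \right| = \frac{1}{2^n} \left| \sum_{x \in T_n} (-1)^{\operatorname{tr}((b' - b)x)} \right| = \begin{cases} 1 & b = b' \\ 0 & b \neq b', \end{cases} \]

hence $B_a$ is an orthonormal basis for any $a \in T_n$. For different bases $a \neq a'$, Lemma 3.3.3 implies

\[ |\langle\psi_b^a|\psi_{b'}^{a'}\rangle| = \frac{1}{\sqrt{2^n}}, \]

thus $B_a$ and $B_{a'}$ are mutually unbiased for any $a, a' \in T_n$, $a \neq a'$. Any $B_a$ is mutually unbiased to the computational basis as the absolute value of each coefficient of the basis states in $|\psi_b^a\rangle$ is $\frac{1}{\sqrt{2^n}}$. □

For the case of a single qubit, Theorem 3.3.4 recovers the well-known mutually unbiased bases (3.2). We will now state the explicit example for two qubits.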

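Observe that $h(X) = X^2 + X + 1 \in \mathbb{Z}_4[X]$ is a primitive polynomial. Hence $GR(4^2) = \mathbb{Z}_4[X]/(X^2 + X + 1)$ has Teichmüller set $T_2 = \{0, 1, X, X^2 = 3X + 3\}$. The trace is given by $\operatorname{tr}(a + 2b) = a + a^2 + 2(b + b^2)$. Therefore the mutually unbiased bases are given by

\[
\begin{aligned}
B_0 &= \tfrac{1}{2} \{ (+1, +1, +1, +1),\; (+1, +1, -1, -1),\; (+1, -1, -1, +1),\; (+1, -1, +1, -1) \}, \\
B_1 &= \tfrac{1}{2} \{ (+1, -1, -i, -i),\; (+1, -1, +i, +i),\; (+1, +1, +i, -i),\; (+1, +1, -i, +i) \}, \\
B_X &= \tfrac{1}{2} \{ (+1, -i, -i, -1),\; (+1, -i, +i, +1),\; (+1, +i, +i, -1),\; (+1, +i, -i, +1) \}, \\
B_{3X+3} &= \tfrac{1}{2} \{ (+1, -i, -1, -i),\; (+1, -i, +1, +i),\; (+1, +i, +1, -i),\; (+1, +i, -1, +i) \}, \text{ and}
\end{aligned}
\]

the computational basis. Note that although we used the Teichmüller elements specific to our choice of $h(X)$, Fact A.10.25 guarantees that we will always get the same set of mutually-unbiased bases up to relabelling of the basis elements.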
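The two-qubit example can be reproduced with a few lines of ring arithmetic. The sketch below is an illustration under stated assumptions: elements $c_0 + c_1 X$ of $GR(4^2)$ are stored as pairs `(c0, c1)` modulo 4, $\omega_4 = i$, and the trace is evaluated through its $\mathbb{Z}_4$-linearity with $\operatorname{tr}(1) = 2$ and $\operatorname{tr}(X) = 3$, which follows from the trace formula above. The function names are hypothetical.

```python
import numpy as np

# Teichmueller set of GR(16) = Z_4[X]/(X^2 + X + 1): 0, 1, X, X^2 = 3X + 3
TEICHMUELLER = [(0, 0), (1, 0), (0, 1), (3, 3)]


def add(u, v):
    """Addition in GR(16)."""
    return ((u[0] + v[0]) % 4, (u[1] + v[1]) % 4)


def mul(u, v):
    """Multiplication in GR(16), reducing with X^2 = 3X + 3."""
    c0 = u[0] * v[0]
    c1 = u[0] * v[1] + u[1] * v[0]
    c2 = u[1] * v[1]
    return ((c0 + 3 * c2) % 4, (c1 + 3 * c2) % 4)


def trace(u):
    """Trace to Z_4, using linearity with tr(1) = 2 and tr(X) = 3."""
    return (2 * u[0] + 3 * u[1]) % 4


def two_qubit_mubs():
    """Computational basis plus the four bases of (3.5); columns are vectors."""
    bases = [np.eye(4, dtype=complex)]
    for a in TEICHMUELLER:
        vectors = []
        for b in TEICHMUELLER:
            coeff = add(a, add(b, b))  # a + 2b
            vectors.append([1j ** trace(mul(coeff, k)) for k in TEICHMUELLER])
        bases.append(np.array(vectors).T / 2)
    return bases


bases = two_qubit_mubs()
print(np.round(2 * bases[2][:, 0]))  # first vector of B_1, scaled by 2
```

The basis vectors are ordered by $b = 0, 1, X, X^2$ and the components by $k = 0, 1, X, X^2$, matching the listing above.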

A different construction makes use of the generalized Pauli operators that were intro- 
duced in the discussion of Quantum State Tomography ÍFact l2~ÏÏTTj) . It was discovered, 
extended and simplified by several authors [BBRV02, LBZ02, Dur05, RBKSS05]. The constructions are based on the following theorem.



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Theorem 3.3.5 ([LBZ02J). The set of 4 N — 1 non-identity generalized Pauli operators 
may be partitioned into $2^N + 1$ sets of $2^N - 1$ pairwise commuting operators. The common eigenbases are mutually unbiased with respect to each other.

3.4 Non Prime-Power Dimensions and Open Problems

We gave several constructions for a maximal set of mutually-unbiased bases in prime power dimensions. However, the situation is quite different if the dimension is not a prime power.

Definition 3.4.1. Denote by $M(d)$ the maximal number of mutually-unbiased bases in a Hilbert space of dimension $d$.

From [WF89] and the preceding section, the following upper and lower bounds are known.

Fact 3.4.2. $M(d) \leq d + 1$ for all $d \in \mathbb{N}$. $M(p^k) = p^k + 1$ for $p$ prime and $k \in \mathbb{N}$.

For the case of non-prime power dimensions, only a fairly weak lower bound is known so far.

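Theorem 3.4.3 ([KR04]). Let $d = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_k^{\alpha_k}$ be the decomposition of $d$ into its distinct prime factors $p_i$. Then

\[ M(d) \geq \min_i \left( p_i^{\alpha_i} + 1 \right). \]

Proof. Let $\mathcal{H}_d$ be the Hilbert space of dimension $d$. Given a decomposition of $d$, we can decompose

\[ \mathcal{H}_d = \mathcal{H}_{p_1^{\alpha_1}} \otimes \mathcal{H}_{p_2^{\alpha_2}} \otimes \cdots \otimes \mathcal{H}_{p_k^{\alpha_k}}. \tag{3.7} \]

Denote by $d(i) = p_i^{\alpha_i}$ the dimension of the $i$-th Hilbert space $\mathcal{H}_{p_i^{\alpha_i}}$ in the decomposition (3.7). Let $\mathcal{B}^{(i)} = \{ B_1^{(i)}, \ldots, B_{d(i)+1}^{(i)} \}$ be a maximal set of $d(i) + 1 = p_i^{\alpha_i} + 1$ mutually-unbiased bases for $\mathcal{H}_{p_i^{\alpha_i}}$. Denote the elements of the basis $B_j^{(i)}$ by

\[ B_j^{(i)} = \{ |\psi_1^{(i,j)}\rangle, \ldots, |\psi_{d(i)}^{(i,j)}\rangle \} \]

and define

\[ m = \min_i \left( p_i^{\alpha_i} + 1 \right). \]

Now we claim that the $m$ sets

\[ A_j = \left\{ |\psi_{l_1}^{(1,j)}\rangle \otimes \cdots \otimes |\psi_{l_k}^{(k,j)}\rangle \;\middle|\; l_i \in \{1, 2, \ldots, d(i)\} \right\}, \quad j = 1, 2, \ldots, m, \]

are orthonormal bases and form a set of $m$ mutually-unbiased bases for $\mathcal{H}_d$. Remember that the inner product of a tensor product evaluates as

\[ \langle\, |\phi_a\rangle \otimes |\phi_b\rangle,\; |\psi_a\rangle \otimes |\psi_b\rangle \,\rangle = \langle\phi_a|\psi_a\rangle \langle\phi_b|\psi_b\rangle \]

and thus the claim follows.

□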
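The tensor product construction from the proof can be carried out explicitly for $d = 6 = 2 \cdot 3$, where it yields $m = 3$ bases. The sketch below assumes NumPy, takes the qubit bases from (3.2) and the qutrit bases from (3.3) with $\omega_3 = e^{2\pi i/3}$, and pairs the $j$-th basis of each factor with a Kronecker product. All function names are illustrative.

```python
import numpy as np

s = 1 / np.sqrt(2)
QUBIT_MUBS = [
    np.eye(2, dtype=complex),
    s * np.array([[1, 1], [1, -1]], dtype=complex),
    s * np.array([[1, 1], [1j, -1j]], dtype=complex),
]


def prime_mubs(p):
    """Complete set of p + 1 bases for an odd prime p; columns are vectors."""
    omega = np.exp(2j * np.pi / p)
    x = np.arange(p)
    bases = [np.eye(p, dtype=complex)]
    for a in range(p):
        exponents = (a * x[:, None] ** 2 + x[:, None] * x[None, :]) % p
        bases.append(omega ** exponents / np.sqrt(p))
    return bases


def tensor_mubs(factor_sets):
    """Combine the j-th bases of all factors for j = 1, ..., m."""
    m = min(len(bases) for bases in factor_sets)
    result = []
    for j in range(m):
        basis = factor_sets[0][j]
        for bases in factor_sets[1:]:
            basis = np.kron(basis, bases[j])  # columns are tensor products
        result.append(basis)
    return result


def is_mub_set(bases, tol=1e-9):
    """Check orthonormality of each basis and pairwise unbiasedness."""
    d = bases[0].shape[0]
    for i, bi in enumerate(bases):
        if not np.allclose(bi.conj().T @ bi, np.eye(d), atol=tol):
            return False
        for bj in bases[i + 1:]:
            if not np.allclose(np.abs(bi.conj().T @ bj), 1 / np.sqrt(d), atol=tol):
                return False
    return True


mubs_6 = tensor_mubs([QUBIT_MUBS, prime_mubs(3)])
print(len(mubs_6), is_mub_set(mubs_6))
```

The fourth qutrit basis is left unused because the qubit factor only provides three bases, which is exactly the minimum in the statement of the theorem.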



It is not known whether this lower bound can be improved in any way. An obvious way to extend Theorem 3.4.3 is to allow for a more general construction of the sets $A_j$ in order to increase their number. This can be done by mixing states from different bases in the tensor product

\[ |\psi_{l_1}^{(1,j_1)}\rangle \otimes \cdots \otimes |\psi_{l_k}^{(k,j_k)}\rangle \]

where both the $j_i$ and $l_i$ are picked according to some combinatorial criteria. However, this will not lead to a set of mutually-unbiased bases. Suppose $d = p_1 p_2$ with $p_1, p_2$ distinct primes. Any construction that yields more than $m$ bases will w.l.o.g. assign states $|\psi_1^{(1,1)}\rangle \otimes |\psi_1^{(2,1)}\rangle$ and $|\psi_1^{(1,1)}\rangle \otimes |\psi_1^{(2,2)}\rangle$ to different bases. This leads to the inner product

\[ \left| \langle\psi_1^{(1,1)}|\psi_1^{(1,1)}\rangle \langle\psi_1^{(2,1)}|\psi_1^{(2,2)}\rangle \right| = \frac{1}{\sqrt{p_2}} \neq \frac{1}{\sqrt{p_1 p_2}}. \]

Therefore, this naive construction cannot give us more than $m$ mutually-unbiased bases.

Furthermore, it was recently shown that the methods presented for prime power dimensions cannot be generalized to non prime-power cases [Arc05]. It is conjectured [Zau99, KR04, KR05a] that $M(d)$ is substantially smaller than $d + 1$ if $d$ is not a power of a prime. However, even the maximal number of mutually-unbiased bases $M(6)$ in a 6-dimensional system is not known exactly. Theorem 3.4.3 gives $M(6) \geq 3$ only, and we know that $M(6) \leq 7$. It is an interesting open problem to even determine $M(6)$.

The problem of the maximal number of mutually-unbiased bases was linked to the problem of determining the maximal number of mutually orthogonal latin squares [WB04, KR05a, HHH05]. It seems that there are connections between both concepts that should be the subject of future research. In particular, further investigation into the existence of a maximal number of mutually-unbiased bases and a maximal number of mutually orthogonal latin squares could lead to fruitful results in either area.



Chapter 4 

Scalable Efficient Noise Estimation 



This chapter contains the first main result, which shows that mutually-unbiased bases are a 2-design for quantum states, using different techniques than the proofs known so far. We will give an explicit construction of circuits that generate states from a complete set of MUBs. We will use that construction to show how the average fidelity can be estimated efficiently.

4.1 Introduction 

The main result emerged in joint work with Richard Cleve, Joseph Emerson, and Etera Livine. As it turned out, a similar result had already been proved using different proof techniques by [KR05a] in general and by [Bar02] for a specific construction. Our result is purely algebraic and relies on the explicit calculation of the integral in Theorem 2.3.18 using Schur's Lemma. Although our result follows as a corollary from [KR05a], the constructions of explicit circuits that generate mutually-unbiased basis states appear to be new. We will first present the main result in our language, then present the earlier proof from [KR05a], and finally derive an efficient circuit that can be used to estimate the average gate fidelity.

Later on, we will generalize the notion of a design from [KR05a, Bar02, Zau99] from states to unitary operators and present an outline for further research in that direction. We suggest that the techniques developed so far can be further generalized to derive efficient experimental protocols that reveal more information about the noise than the average fidelity can. Furthermore, this opens a new perspective on various notions of pseudo-randomness used in different quantum protocols. It unifies several applications from different areas of quantum computation in that they use the same "amount" of pseudo-randomness according to our classification.

4.2 Calculation of Haar averages using MUB vectors

We show that the average of a certain quartic function over the Haar measure on all unitary operators on a complex inner product space $\mathbb{C}^d$ can be calculated using only the vectors of a maximal set of mutually unbiased bases. We assume that such a maximal set of mutually unbiased bases exists. As $d + 1$ mutually unbiased bases are only known to exist for prime power dimensions, we restrict ourselves to these cases. The proof we present below is original work and, to the best of our knowledge, has not been found before. In the next section we will discuss how this result can be derived as a corollary of a fairly recent result by Klappenecker and Roetteler [KR05a].

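Let $\mathcal{H} = \mathbb{C}^d$ be a complex inner product space of dimension $d$. Denote by

\[ B_a = \{ |\psi_b^a\rangle : b = 0, \dots, d-1 \} \tag{4.1} \]

the $a$-th basis of a set of $d + 1$ mutually unbiased bases for $a \in \{0, 1, \dots, d\}$. It follows that

\[ |\langle \psi_b^a | \psi_{b'}^{a'} \rangle|^2 = \begin{cases} 1 & a = a',\ b = b' \\ 0 & a = a',\ b \neq b' \\ \frac{1}{d} & a \neq a'. \end{cases} \]

Recall that $L(\mathcal{H})$ denotes the inner product space of all linear operators on $\mathcal{H}$, using the Hilbert-Schmidt inner product $\langle A, B \rangle = \operatorname{tr}(A^\dagger B)$. Let $W \subset L(\mathcal{H})$ be the subspace of all Hermitian traceless linear operators on $\mathcal{H}$. Note that on $W$ the inner product simplifies to the real-valued $\langle A, B \rangle = \operatorname{tr}(AB)$.

Lemma 4.2.1. Let

\[ W_a = \left\{ \sum_{b=0}^{d-1} r_b\, |\psi_b^a\rangle\langle\psi_b^a| \;:\; \sum_{b=0}^{d-1} r_b = 0,\ r_b \in \mathbb{R} \right\}. \tag{4.2} \]

Then $W = \bigoplus_{a=0}^{d} W_a$.

Proof. Note that $W_a \perp W_{a'}$ for $a \neq a'$ as

\[ \left\langle \sum_{b=0}^{d-1} r_b |\psi_b^a\rangle\langle\psi_b^a| ,\ \sum_{b'=0}^{d-1} s_{b'} |\psi_{b'}^{a'}\rangle\langle\psi_{b'}^{a'}| \right\rangle = \sum_{b=0}^{d-1} \sum_{b'=0}^{d-1} r_b s_{b'} \operatorname{tr}\left( |\psi_b^a\rangle\langle\psi_b^a|\psi_{b'}^{a'}\rangle\langle\psi_{b'}^{a'}| \right) \]
\[ = \sum_{b=0}^{d-1} \sum_{b'=0}^{d-1} r_b s_{b'} |\langle\psi_b^a|\psi_{b'}^{a'}\rangle|^2 = \frac{1}{d} \sum_{b=0}^{d-1} r_b \sum_{b'=0}^{d-1} s_{b'} = 0 \]

and $\dim W_a = d - 1$, $\dim W = d^2 - 1$ in real parameters. Thus $W$ is indeed the direct sum of its $d + 1$ subspaces $W_a$, as the sum of their dimensions $(d+1)(d-1)$ yields $d^2 - 1$. □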

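Lemma 4.2.2. For each $a$,

\[ \Pi_a(X) = \sum_{b=0}^{d-1} |\psi_b^a\rangle\langle\psi_b^a| X |\psi_b^a\rangle\langle\psi_b^a| \tag{4.3} \]

is an orthogonal projector onto $W_a$. The operators $\{\Pi_a \mid a = 0, 1, \dots, d\} \subset L(W)$ form a complete set of orthogonal projectors onto $W$.

Proof. Pick an arbitrary operator $X = \sum_{b'} r_{b'} |\psi_{b'}^a\rangle\langle\psi_{b'}^a| \in W_a$ and observe that

\[ \Pi_a(X) = \sum_{b=0}^{d-1} |\psi_b^a\rangle\langle\psi_b^a| \left( \sum_{b'=0}^{d-1} r_{b'} |\psi_{b'}^a\rangle\langle\psi_{b'}^a| \right) |\psi_b^a\rangle\langle\psi_b^a| \]
\[ = \sum_{b=0}^{d-1} \sum_{b'=0}^{d-1} r_{b'}\, |\psi_b^a\rangle \langle\psi_b^a|\psi_{b'}^a\rangle \langle\psi_{b'}^a|\psi_b^a\rangle \langle\psi_b^a| = \sum_{b=0}^{d-1} r_b |\psi_b^a\rangle\langle\psi_b^a| = X, \]

and analogously for $a \neq a'$ and $X = \sum_{b'} r_{b'} |\psi_{b'}^{a'}\rangle\langle\psi_{b'}^{a'}| \in W_{a'}$

\[ \Pi_a(X) = \sum_{b=0}^{d-1} \sum_{b'=0}^{d-1} r_{b'}\, |\psi_b^a\rangle \langle\psi_b^a|\psi_{b'}^{a'}\rangle \langle\psi_{b'}^{a'}|\psi_b^a\rangle \langle\psi_b^a| = \sum_{b=0}^{d-1} \left( \frac{1}{d} \sum_{b'=0}^{d-1} r_{b'} \right) |\psi_b^a\rangle\langle\psi_b^a| = 0 \]

since $\sum_{b=0}^{d-1} r_b = 0$. Therefore, $\Pi_a$ is a projector onto $W_a$. Completeness and orthogonality follow from Lemma 4.2.1. □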

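Corollary 4.2.3. For $M, N \in W$,

\[ \sum_{a=0}^{d} \operatorname{tr}\left( \Pi_a(M) \Pi_a(N) \right) = \operatorname{tr} MN. \tag{4.4} \]

Proof. Observe that $\sum_{a=0}^{d} \operatorname{tr}\left( \Pi_a(M) \Pi_a(N) \right) = \sum_{a=0}^{d} \langle \Pi_a(M), \Pi_a(N) \rangle$ and the statement follows directly from the fact that the $\Pi_a$ form a complete set of orthogonal projectors. □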

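Theorem 4.2.4. Let $M, N \in W$. Then

\[ \sum_{a=0}^{d} \sum_{b=0}^{d-1} \langle\psi_b^a|M|\psi_b^a\rangle \langle\psi_b^a|N|\psi_b^a\rangle = \operatorname{tr} MN. \tag{4.5} \]

Proof.

\[ \sum_{a=0}^{d} \sum_{b=0}^{d-1} \langle\psi_b^a|M|\psi_b^a\rangle \langle\psi_b^a|N|\psi_b^a\rangle = \sum_{a=0}^{d} \operatorname{tr}\left( \sum_{b=0}^{d-1} |\psi_b^a\rangle\langle\psi_b^a|M|\psi_b^a\rangle \langle\psi_b^a|N|\psi_b^a\rangle\langle\psi_b^a| \right) \]
\[ = \sum_{a=0}^{d} \operatorname{tr}\left( \left( \sum_{b=0}^{d-1} |\psi_b^a\rangle\langle\psi_b^a|M|\psi_b^a\rangle\langle\psi_b^a| \right) \left( \sum_{b'=0}^{d-1} |\psi_{b'}^a\rangle\langle\psi_{b'}^a|N|\psi_{b'}^a\rangle\langle\psi_{b'}^a| \right) \right) \]
\[ = \sum_{a=0}^{d} \operatorname{tr}\left( \Pi_a(M) \Pi_a(N) \right) = \operatorname{tr} MN \]

by Corollary 4.2.3. □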
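Corollary 4.2.5. Let $M, N$ be Hermitian operators. Then

\[ \sum_{a=0}^{d} \sum_{b=0}^{d-1} \langle\psi_b^a|M|\psi_b^a\rangle \langle\psi_b^a|N|\psi_b^a\rangle = \operatorname{tr} MN + \operatorname{tr} M \operatorname{tr} N. \tag{4.6} \]

Proof. Construct the traceless Hermitian operators $\tilde{M} = M - \frac{\operatorname{tr} M}{d} \mathbb{1}$, $\tilde{N} = N - \frac{\operatorname{tr} N}{d} \mathbb{1}$, apply (4.5) to them, and simplify the left and right hand sides using the bilinearity of both sides of (4.6) on the space $L(\mathcal{H})$ of linear operators on $\mathcal{H}$ to get the result. □

Corollary 4.2.6. Let $M, N$ be linear operators on $\mathcal{H}$. Then

\[ \sum_{a=0}^{d} \sum_{b=0}^{d-1} \langle\psi_b^a|M|\psi_b^a\rangle \langle\psi_b^a|N|\psi_b^a\rangle = \operatorname{tr} MN + \operatorname{tr} M \operatorname{tr} N. \tag{4.7} \]

Proof. Construct the Hermitian operators

\[ M_1 = M + M^\dagger, \quad M_2 = i(M - M^\dagger), \quad N_1 = N + N^\dagger, \quad N_2 = i(N - N^\dagger). \]

Using that both sides of (4.6) are bilinear forms $\langle \cdot, \cdot \rangle$ on the space $L(\mathcal{H})$ of linear operators on $\mathcal{H}$, observe that

\[ \langle M_1, N_1 \rangle - i \langle M_1, N_2 \rangle - i \langle M_2, N_1 \rangle - \langle M_2, N_2 \rangle \]
\[ = \langle M, N \rangle + \langle M, N^\dagger \rangle + \langle M^\dagger, N \rangle + \langle M^\dagger, N^\dagger \rangle \]
\[ \quad + \langle M, N \rangle - \langle M, N^\dagger \rangle + \langle M^\dagger, N \rangle - \langle M^\dagger, N^\dagger \rangle \]
\[ \quad + \langle M, N \rangle + \langle M, N^\dagger \rangle - \langle M^\dagger, N \rangle - \langle M^\dagger, N^\dagger \rangle \]
\[ \quad + \langle M, N \rangle - \langle M, N^\dagger \rangle - \langle M^\dagger, N \rangle + \langle M^\dagger, N^\dagger \rangle \]
\[ = 4 \langle M, N \rangle \]

and the statement follows. □

Corollary 4.2.7. For any linear operators $M, N \in L(\mathcal{H})$, the average over the Fubini-Study measure is the same as the average over a complete set of mutually-unbiased bases:

\[ \int_{F\text{-}S} \langle\psi|M|\psi\rangle \langle\psi|N|\psi\rangle \, d|\psi\rangle = \frac{1}{d(d+1)} \sum_{a=0}^{d} \sum_{b=0}^{d-1} \langle\psi_b^a|M|\psi_b^a\rangle \langle\psi_b^a|N|\psi_b^a\rangle. \tag{4.8} \]

Proof. Combine Corollary 4.2.6 and Theorem 2.3.18. □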
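The identity of Corollary 4.2.6 can be checked numerically in a small odd prime dimension. The sketch below is only an illustration: it assumes the standard prime-dimension construction with states $\frac{1}{\sqrt{p}} \sum_x \omega_p^{ax^2 + bx} |x\rangle$ together with the computational basis, and the helper names `mub_states` and `mub_sum` are hypothetical.

```python
import numpy as np


def mub_states(p):
    """All p*(p+1) MUB vectors for an odd prime p, one vector per row."""
    omega = np.exp(2j * np.pi / p)
    x = np.arange(p)
    # computational basis first
    states = [np.eye(p, dtype=complex)[b] for b in range(p)]
    # then the p bases with quadratic phases omega^(a x^2 + b x)
    for a in range(p):
        for b in range(p):
            states.append(omega ** ((a * x * x + b * x) % p) / np.sqrt(p))
    return np.array(states)


def mub_sum(M, N, states):
    """Sum over all MUB states of <psi|M|psi><psi|N|psi>."""
    m = np.einsum("si,ij,sj->s", states.conj(), M, states)
    n = np.einsum("si,ij,sj->s", states.conj(), N, states)
    return np.sum(m * n)
```

For arbitrary complex matrices `M` and `N` of size $p \times p$, `mub_sum(M, N, mub_states(p))` should agree with `np.trace(M @ N) + np.trace(M) * np.trace(N)` up to rounding.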

4.3 Mutually-Unbiased Bases are 2-designs 



Klappenecker and Roetteler actually showed a similar result a little earlier [KR05a], which we will present in this section. We will start with a little bit of notation.

The set of all quantum states forms a complex unit sphere $S^{d-1}$ in $\mathcal{H}_d$. As global phases have no observable effect, we define an equivalence relation on quantum states by letting $|\psi\rangle \equiv |\varphi\rangle$ if and only if there is $\theta \in [0, 2\pi)$ such that $|\psi\rangle = e^{i\theta} |\varphi\rangle$. Then $CS^{d-1} = S^{d-1}/\equiv$ can be thought of as the analog of the Bloch sphere for a $d$-dimensional quantum system.

Lemma 4.3.1 ([KR05a]). For all normalized $|\varphi\rangle \in \mathcal{H}$ and $k \in \mathbb{N}$,

\[ \int_{F\text{-}S} |\langle\psi|\varphi\rangle|^{2k} \, d|\psi\rangle = \frac{1}{\binom{d+k-1}{k}}. \]

The next ingredient is the notion of homogeneous polynomials. Denote by $\operatorname{Hom}(k, l) \subseteq \mathbb{C}[x_1, \dots, x_d, y_1, \dots, y_d]$ the set of all polynomials of homogeneous degree $k$ in the variables $x_1, \dots, x_d$ and of homogeneous degree $l$ in the variables $y_1, \dots, y_d$. We define the restriction of $p \in \operatorname{Hom}(k, l)$ onto the complex sphere of quantum states with different observable effects as

\[ p_\circ(|\psi\rangle) = p(a_1, \dots, a_d, \bar{a}_1, \dots, \bar{a}_d) \]

where $|\psi\rangle = \sum_{i=1}^{d} a_i |\psi_i\rangle$ for an orthonormal basis $\{|\psi_i\rangle\}_{i=1}^{d}$ for $\mathcal{H}$. It follows from the equivalence relation that defined $CS^{d-1}$ that $k = l$ is required in order for the definition of $p_\circ$ to be independent of the representative $|\psi\rangle \in CS^{d-1}$. Thus we define

\[ \operatorname{Hom}(k, k)_\circ = \{ p_\circ \mid p \in \operatorname{Hom}(k, k) \}. \]

Now we can turn to the definition of complex projective designs.

Definition 4.3.2. A complex projective $t$-design is a nonempty finite subset $X \subseteq CS^{d-1}$ such that the "cubature formula"

\[ \frac{1}{|X|} \sum_{|\psi\rangle \in X} p(|\psi\rangle) = \int_{F\text{-}S} p(|\psi\rangle) \, d|\psi\rangle \tag{4.9} \]

holds for any $p \in \operatorname{Hom}(t, t)_\circ$, where we understand that $p(|\psi\rangle)$ is a function in the coefficients of $|\psi\rangle$ in some orthonormal basis.

We will refer to $X$ just as a $t$-design if it is clear from the context that $X \subseteq CS^{d-1}$.

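Theorem 4.3.3. Let $X$ be a finite subset of $CS^{d-1}$. The following statements are equivalent:

1. $X$ is a $t$-design.

2. For all $|\varphi\rangle \in \mathcal{H}$ and all $0 \le k \le t$,

\[ \frac{1}{|X|} \sum_{|\psi\rangle \in X} |\langle\psi|\varphi\rangle|^{2k} = \frac{\langle\varphi|\varphi\rangle^k}{\binom{d+k-1}{k}}. \tag{4.10} \]

3. For $0 \le k \le t$,

\[ \frac{1}{|X|^2} \sum_{|\psi\rangle, |\varphi\rangle \in X} |\langle\psi|\varphi\rangle|^{2k} = \frac{1}{\binom{d+k-1}{k}}. \tag{4.11} \]

Proof. We will show that (1) implies (2). Let $|\varphi\rangle \in \mathcal{H}$ and observe that $p(|\psi\rangle) = |\langle\psi|\varphi\rangle|^{2k} = \langle\psi|\varphi\rangle^k \langle\varphi|\psi\rangle^k$ is a homogeneous polynomial in $\operatorname{Hom}(k, k)_\circ$. $X$ is a $t$-design, therefore

\[ \frac{1}{|X|} \sum_{|\psi\rangle \in X} |\langle\psi|\varphi\rangle|^{2k} = \int_{F\text{-}S} |\langle\psi|\varphi\rangle|^{2k} \, d|\psi\rangle \]

holds for all $0 \le k \le t$. Dividing by $\langle\varphi|\varphi\rangle^k$ enables us to use Lemma 4.3.1 and thus the right-hand side evaluates to

\[ \frac{\langle\varphi|\varphi\rangle^k}{\binom{d+k-1}{k}} \]

and (4.10) follows.

Next we will show that (2) implies (3). Summing (4.10) over all $|\varphi\rangle \in X$ and using that $X$ consists of normalized unit vectors, we have

\[ \frac{|X|}{\binom{d+k-1}{k}} = \sum_{|\varphi\rangle \in X} \frac{\langle\varphi|\varphi\rangle^k}{\binom{d+k-1}{k}} = \frac{1}{|X|} \sum_{|\varphi\rangle \in X} \sum_{|\psi\rangle \in X} |\langle\psi|\varphi\rangle|^{2k} \]

for all $0 \le k \le t$ and (4.11) follows.

Now we will show how (1) follows from (3). We will use that for the $k$-fold tensor product, the inner product evaluates to $\langle\psi^{\otimes k}|\varphi^{\otimes k}\rangle = \langle\psi|\varphi\rangle^k$. Define the vector

\[ |v\rangle = \frac{1}{|X|} \sum_{|\psi\rangle \in X} |\psi\rangle^{\otimes k} \otimes |\bar{\psi}\rangle^{\otimes k} - \int_{F\text{-}S} |\psi\rangle^{\otimes k} \otimes |\bar{\psi}\rangle^{\otimes k} \, d|\psi\rangle \]

where integration is understood with respect to the coordinate functions of $|\psi\rangle^{\otimes k} \otimes |\bar{\psi}\rangle^{\otimes k}$. The inner product evaluates to

\[ \langle v|v\rangle = \frac{1}{|X|^2} \sum_{|\psi\rangle, |\varphi\rangle \in X} |\langle\psi|\varphi\rangle|^{2k} - \int_{F\text{-}S} \int_{F\text{-}S} |\langle\psi|\varphi\rangle|^{2k} \, d|\varphi\rangle \, d|\psi\rangle, \]

where the cross terms between the sum and the integral have already been evaluated with Lemma 4.3.1. From (4.11), Lemma 4.3.1 and the normalization of the Fubini-Study measure it follows that

\[ \langle v|v\rangle = \frac{1}{\binom{d+k-1}{k}} - \int_{F\text{-}S} \frac{1}{\binom{d+k-1}{k}} \, d|\psi\rangle = \frac{1}{\binom{d+k-1}{k}} - \frac{1}{\binom{d+k-1}{k}} = 0 \]

for all $0 \le k \le t$.

From $\langle v|v\rangle = 0$ it follows that $|v\rangle = 0$, thus (4.9) holds for every monomial in $\operatorname{Hom}(k, k)_\circ$, as $|\psi\rangle^{\otimes k} \otimes |\bar{\psi}\rangle^{\otimes k}$ gives all monomials in $\operatorname{Hom}(k, k)_\circ$ with coefficient 1 in its coordinate functions. By linearity, we conclude that the cubature formula holds for all polynomials in $\operatorname{Hom}(k, k)_\circ$ and thus $X$ is a $t$-design. □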

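Some more notation is needed. The "angle" set $A$ of a subset $X \subseteq CS^{d-1}$ is defined as

\[ A = \left\{ |\langle\psi|\varphi\rangle|^2 \;\middle|\; |\psi\rangle, |\varphi\rangle \in X,\ |\psi\rangle \neq |\varphi\rangle \right\}. \]

For $|\psi\rangle \in X$ and "angle" $\alpha \in A$, the subdegree of $|\psi\rangle$ with respect to $\alpha$ is

\[ d_\alpha(|\psi\rangle) = \left| \left\{ |\varphi\rangle \in X \;\middle|\; |\langle\psi|\varphi\rangle|^2 = \alpha \right\} \right|. \]

If for all $\alpha \in A$, $d_\alpha(|\psi\rangle)$ is the same for all $|\psi\rangle \in X$, $X$ is called a regular scheme. The states of a set of mutually-unbiased bases form a regular scheme.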

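Theorem 4.3.4. The states of a complete set of mutually-unbiased bases form a 2-design $X$ in $CS^{d-1}$ with "angle" set $\{0, \frac{1}{d}\}$ and $d(d+1)$ elements.

Proof. The number of elements and the "angle" set follow from the definition of mutually-unbiased bases (3.1). We use statement (3) in Theorem 4.3.3 and show that (4.11) holds for $k = 0, 1, 2$. The $k = 0$ case is immediate. Each of the $d(d+1)$ states has overlap 1 with itself, overlap 0 with the other states of its own basis, and squared overlap $\frac{1}{d}$ with each of the $d^2$ states of the other bases. For $k = 1$, we see

\[ \frac{1}{d^2 (d+1)^2} \sum_{|\psi\rangle, |\varphi\rangle \in X} |\langle\psi|\varphi\rangle|^2 = \frac{1}{d^2 (d+1)^2} \left( d(d+1) + d^2 (d+1) \right) = \frac{1}{d} = \frac{1}{\binom{d}{1}}. \]

For $k = 2$, we have

\[ \frac{1}{d^2 (d+1)^2} \sum_{|\psi\rangle, |\varphi\rangle \in X} |\langle\psi|\varphi\rangle|^4 = \frac{1}{d^2 (d+1)^2} \left( d(d+1) + d(d+1) \right) = \frac{2}{d(d+1)} = \frac{1}{\binom{d+1}{2}}. \]

□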
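The counting argument in the proof can be replayed numerically. The following sketch is an illustration under the assumption of the usual odd-prime construction of the bases (quadratic phases plus the computational basis); `mub_states` and `frame_potential` are hypothetical helper names. It evaluates the left-hand side of (4.11) and compares it with $1/\binom{d+k-1}{k}$.

```python
import numpy as np
from math import comb


def mub_states(p):
    """All p*(p+1) MUB vectors for an odd prime p, one vector per row."""
    omega = np.exp(2j * np.pi / p)
    x = np.arange(p)
    states = [np.eye(p, dtype=complex)[b] for b in range(p)]
    for a in range(p):
        for b in range(p):
            states.append(omega ** ((a * x * x + b * x) % p) / np.sqrt(p))
    return np.array(states)


def frame_potential(states, k):
    """(1/|X|^2) * sum over all ordered pairs of |<psi|phi>|^(2k)."""
    squared_overlaps = np.abs(states.conj() @ states.T) ** 2
    return np.sum(squared_overlaps ** k) / len(states) ** 2
```

For $k = 1, 2$ the value matches $1/\binom{p+k-1}{k}$, while for $k = 3$ it does not, which is consistent with the states forming a 2-design but not a 3-design.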

[KR05a] also showed the converse, which we state without a proof.

Theorem 4.3.5. A 2-design in $CS^{d-1}$ with "angle" set $\{0, \frac{1}{d}\}$ is a union of $d + 1$ mutually-unbiased bases.

4.3.1 Equivalence to Our Approach

The main result in Corollary 4.2.7 from the previous section now follows directly from Theorem 4.3.4 as

\[ \langle\psi|M|\psi\rangle \langle\psi|N|\psi\rangle \]

is a homogeneous polynomial in $\operatorname{Hom}(2, 2)_\circ$. The other direction follows as well. To see this, pick a monomial $m(|\psi\rangle) = x_a \bar{x}_b \bar{x}_c x_d \in \operatorname{Hom}(2, 2)_\circ$, where $x_i$ denotes the component of the $i$-th computational basis state in some state $|\psi\rangle = \sum_{i=1}^{d} x_i |i\rangle$. Let $M = |c\rangle\langle a|$, $N = |b\rangle\langle d|$ and observe that

\[ \langle\psi|M|\psi\rangle \langle\psi|N|\psi\rangle = \langle\psi|c\rangle \langle a|\psi\rangle \langle\psi|b\rangle \langle d|\psi\rangle = \bar{x}_c x_a \bar{x}_b x_d = m(|\psi\rangle). \]

This extends to all homogeneous polynomials by linearity of (4.8), thus it also shows that a complete set of mutually-unbiased bases is a 2-design.

4.4 Efficient Fidelity Estimation 
4.4.1 Introduction 

We are now ready to show that the average gate fidelity (see Corollary 2.3.19)

\[ F_{\mathrm{avg}}(U, \mathcal{E}) = \frac{\sum_k |\operatorname{tr} E_k|^2 + d}{d(d+1)} \]

can be estimated using a simple experimental setup. The $E_k$ denote the Kraus operators of $\mathcal{E}$ (see Corollary 2.3.19). Figure 4.1 shows the circuit that can be used to estimate the fidelity of an implementation of $U$.

However, we will need to make certain assumptions to end up with a circuit as simple as that. First of all, we need to assume that the cumulative noise characterized by $\mathcal{E}$ is independent of the actual quantum algorithm $U$ that is implemented in the quantum computer in question. Although $\mathcal{E}$ can be thought of as covering the noise induced by our experimental control, it is clear that the cumulative noise will usually depend on the gate being implemented. It seems natural that an implementation of the Quantum Fourier Transform will introduce more noise than the implementation of the identity operation $\mathbb{1}$. Furthermore, we need to assume that the additional pieces of the circuit used to measure the average fidelity introduce no additional noise. For our purposes, it would already be helpful if we could get a lower bound on the average fidelity. This is what our procedure will lead to, as the fidelity cannot increase when we implement $U$ and the additional operation $U^\dagger$.
4.4.2 Using Mutually-Unbiased Bases 

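We start with the basis state $|0\rangle$ and map it to a random vector $|\psi_b^a\rangle$ in one of the mutually-unbiased bases $B_a$ chosen at random. The parameters $a$ and $b$ are chosen classically at random. Then we apply the "motion-reversal procedure" $U^\dagger U$ [EAZ05] and measure the result in the $B_a$ basis. To implement this, we will show how to construct a unitary $V_b^a : |0\rangle \mapsto |\psi_b^a\rangle$, apply it to $|0\rangle$ in the beginning, apply $(V_b^a)^\dagger$ at the end, and measure with respect to $|0\rangle$ and $|0^\perp\rangle$. Let $p$ be the probability that the outcome is $|0\rangle$. According to our assumptions, the cumulative noise is characterized by $\mathcal{E}$ and the quantum operation of the implementation is given as

\[ \mathcal{E}(\rho) = \sum_k E_k U^\dagger U \rho\, U^\dagger U E_k^\dagger = \sum_k E_k \rho E_k^\dagger. \]

Figure 4.1: Circuit to Estimate the Average Fidelity (input $|0\rangle$, followed by $V_b^a$, the implementation of $U$ and its reversal, and $(V_b^a)^\dagger$).

The construction covers quantum channels as well. Set $U = \mathbb{1}$ and we obtain the corresponding circuits that estimate the fidelity of a quantum channel.

Theorem 4.4.1. The probability to measure $|0\rangle$ is

\[ p = \frac{\sum_k |\operatorname{tr} E_k|^2 + d}{d(d+1)}. \tag{4.12} \]

Proof. Using the main result in Corollary 4.2.6,

\[ p = \frac{1}{d(d+1)} \sum_{a=0}^{d} \sum_{b=0}^{d-1} \langle\psi_b^a| \mathcal{E}\left( |\psi_b^a\rangle\langle\psi_b^a| \right) |\psi_b^a\rangle \]
\[ = \frac{1}{d(d+1)} \sum_k \sum_{a=0}^{d} \sum_{b=0}^{d-1} \langle\psi_b^a| E_k |\psi_b^a\rangle \langle\psi_b^a| E_k^\dagger |\psi_b^a\rangle \]
\[ \overset{\text{Cor. 4.2.6}}{=} \frac{1}{d(d+1)} \left( \operatorname{tr}\Big( \sum_k E_k E_k^\dagger \Big) + \sum_k \operatorname{tr} E_k \operatorname{tr} E_k^\dagger \right) \]
\[ = \frac{\sum_k |\operatorname{tr} E_k|^2 + d}{d(d+1)}, \]

where the last step uses $\sum_k E_k^\dagger E_k = \mathbb{1}$ and the cyclicity of the trace. □

Corollary 4.4.2. The probability to measure $|0\rangle$ equals the average gate fidelity.

\[ p = F_{\mathrm{avg}}(U, \mathcal{E}) \tag{4.13} \]

Proof. Follows directly from Corollary 4.2.7. □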
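As a sanity check of Theorem 4.4.1, the sketch below averages the survival probability $\langle\psi|\mathcal{E}(|\psi\rangle\langle\psi|)|\psi\rangle$ over all MUB states for a randomly generated channel and compares it with (4.12). Everything here is an illustrative assumption: the odd-prime MUB construction, the way the random Kraus operators are drawn (`random_kraus`), and the helper names are not taken from the original text, and the ideal $U$ followed by its reversal is taken to cancel exactly.

```python
import numpy as np


def mub_states(p):
    """All p*(p+1) MUB vectors for an odd prime p, one vector per row."""
    omega = np.exp(2j * np.pi / p)
    x = np.arange(p)
    states = [np.eye(p, dtype=complex)[b] for b in range(p)]
    for a in range(p):
        for b in range(p):
            states.append(omega ** ((a * x * x + b * x) % p) / np.sqrt(p))
    return np.array(states)


def random_kraus(d, n_ops, rng):
    """Random Kraus operators E_k with sum_k E_k^dagger E_k = identity."""
    g = rng.normal(size=(n_ops * d, d)) + 1j * rng.normal(size=(n_ops * d, d))
    q, _ = np.linalg.qr(g)  # q has orthonormal columns, i.e. it is an isometry
    return q.reshape(n_ops, d, d)  # stack of d x d blocks E_k


def survival_probability(kraus, states):
    """Average over the states of sum_k |<psi|E_k|psi>|^2."""
    amps = np.einsum("si,kij,sj->sk", states.conj(), kraus, states)
    return np.mean(np.sum(np.abs(amps) ** 2, axis=1))


def fidelity_formula(kraus):
    """Right-hand side of (4.12): (sum_k |tr E_k|^2 + d) / (d (d + 1))."""
    d = kraus.shape[1]
    traces = np.trace(kraus, axis1=1, axis2=2)
    return (np.sum(np.abs(traces) ** 2) + d) / (d * (d + 1))
```

For the identity channel, `fidelity_formula(np.eye(d)[None])` evaluates to 1, as expected for a noiseless implementation.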

Estimation of the average fidelity has been reduced to estimating the probability $p$. In the discussion of Quantum State Tomography in Section 2.3.1 it was shown how a probability can be estimated in $l$ trials within a standard deviation of at most $\frac{1}{\sqrt{l}}$. Hence we need only a constant number of experiments, independent of the size of the system, to estimate the average fidelity within a fixed absolute error.
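The scaling of the statistical error can be made explicit with a few lines of Python. This is a generic sketch about Bernoulli trials, not a procedure from the original text: the empirical frequency after $l$ independent trials has standard deviation $\sqrt{p(1-p)/l}$, which never exceeds $\frac{1}{2\sqrt{l}}$. The function names are hypothetical.

```python
from math import ceil, sqrt


def std_of_estimate(p, l):
    """Standard deviation of the empirical frequency after l independent trials."""
    return sqrt(p * (1 - p) / l)


def trials_for_std(sigma):
    """Number of trials that keeps the standard deviation below sigma for every p.

    Uses the worst case p(1 - p) <= 1/4, so the answer does not depend on p
    or on the dimension of the system.
    """
    return ceil(1 / (4 * sigma ** 2))
```

Note that the required number of trials depends only on the target accuracy, which is the sense in which the number of experiments is constant.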

In order to justify the assumption that the additional circuitry around $U$ and $U^\dagger$ that supports the estimation of the average fidelity does not cause any significant additional noise, we need to find constructions using as few additional qubits and gates as possible. The idea is that additional gates and qubits generally require more experimental control, which in turn introduces additional noise. In order to minimize the effect of this additional noise, we would like to keep the number of gates in the additional circuitry smaller than the number of gates used to realize $U$ and $U^\dagger$. We would also like to keep the number of ancillas as small as possible.



4.4.3 Prime Dimension Construction 

The construction of mutually-unbiased bases for prime power dimension was particularly intriguing, and it becomes very simple if the dimension is not a proper power of a prime but just a prime $p > 2$. Let $\mathcal{H}_d$ be the Hilbert space of the $N$-qubit system of dimension $d = 2^N$. Let $p$ be the smallest prime such that $p > 2^N$ and let $\mathcal{H}_p$ be a Hilbert space of dimension $p$. It is known that $p < 2^{N+1}$ [ES03, Th. 5.9], which we will use to emulate dimension $p$ in dimension $2^{N+1}$.
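The choice of $p$ is easy to reproduce. The sketch below finds the smallest prime above $2^N$ by trial division; the function names are illustrative and the method is chosen for brevity, not speed. The bound $p < 2^{N+1}$ can then be checked for small $N$.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small illustrative inputs."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def smallest_prime_above(n):
    """Smallest prime strictly greater than n."""
    p = n + 1
    while not is_prime(p):
        p += 1
    return p
```

For example, three qubits ($2^N = 8$) lead to $p = 11$, which fits into the $2^{N+1} = 16$ dimensions of four qubits.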

It seems tempting to just add another qubit to the circuit and embed $\mathcal{H}_p$ into the $2^{N+1}$-dimensional Hilbert space $\mathcal{H}_{2d}$. This is done by identifying $\mathcal{H}_p$ with the span of the first $p$ basis vectors of the computational basis of $\mathcal{H}_{2d}$, for example. Let $\mathcal{E}$ be the quantum operation on $\mathcal{H}_d$ in question and let

\[ \mathcal{E}(\rho) = \sum_k A_k \rho A_k^\dagger \]

be its Kraus operator-sum decomposition from Fact 2.2.3. The map $\mathcal{E}'$ in the larger space $\mathcal{H}_{2d}$ is given by

\[ \mathcal{E}'(\rho) = (\mathcal{E} \otimes \mathbb{1})(\rho) = \sum_k (A_k \otimes \mathbb{1})\, \rho\, (A_k^\dagger \otimes \mathbb{1}). \]

This is not a trivial embedding of $\mathcal{E}$ into $\mathcal{H}_p$, as the tensor product structure of $\mathcal{E}'$ forbids the direct use of $\mathcal{E}'$ on $\mathcal{H}_p$; this would only make sense if $\mathcal{E}'$ was a direct sum $\mathcal{E} \oplus \mathbb{1}$. However, we will show how we can still make use of the embedded prime dimension Hilbert space $\mathcal{H}_p$.

Figure 4.2: Circuit to Estimate the Average Fidelity using MUBs (input $|0\rangle$, followed by $V_b^a$, $U$, $U^\dagger$, and $(V_b^a)^\dagger$).

Suppose $P$ is a projector from $\mathcal{H}_{2d}$ onto $\mathcal{H}_d$. Clearly, it is also a projector from $\mathcal{H}_p$ onto $\mathcal{H}_d$. Let further $\bar{p}$ denote the probability to measure zero at the end of the circuit shown in Figure 4.2, where we average over the set of states

\[ B = \left\{ |\tilde{\psi}_b^a\rangle = P |\psi_b^a\rangle \;\middle|\; a \in \{0, 1, \dots, p\},\ b \in \{0, 1, \dots, p-1\} \right\}. \]

Then $\bar{p}$ is the average fidelity up to a constant factor. More precisely, we have the following

Theorem 4.4.3. The probability to measure zero averaged over the uniform distribution of $a$ and $b$ is given by

\[ \bar{p} = \frac{\sum_k |\operatorname{tr} A_k|^2 + d}{p(p+1)}. \tag{4.14} \]

Proof. We use that $P$ also maps $U \otimes \mathbb{1}$ onto $U$ by conjugation, hence

\[ P (U \otimes \mathbb{1}) P^\dagger = U. \]

The same argument can be used in conjunction with Corollary 4.2.6 and Theorem 4.4.1 and the right-hand side of (4.14) follows. □

Corollary 4.4.4. The average fidelity is given by $F_{\mathrm{avg}}(\mathcal{E}, U) = \frac{p(p+1)}{d(d+1)}\, \bar{p}$.

It remains to show that we can efficiently construct the elements $|\tilde{\psi}_b^a\rangle$ of the set $B$ given the gate $V_b^a$ that creates the state $|\psi_b^a\rangle$ from a complete set of mutually-unbiased bases in prime dimension $p$ on $N + 1$ qubits.

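Theorem 4.4.5. Let $V_b^a$ denote the gate that maps

\[ V_b^a : |0\rangle \mapsto |\psi_b^a\rangle. \]

Let $C(N)$ and $D(N)$ denote its gate complexity and depth, respectively. Then $\tilde{V}_b^a$ can be constructed using $O(N^2 + C(N))$ single and two-qubit gates in depth $O(N + D(N))$ using two ancilla qubits.

Proof. Using $V_b^a$ on $N$ qubits and the first ancilla, we can construct $|\psi_b^a\rangle = V_b^a |0^{(N+1)}\rangle$ on $N + 1$ qubits and leave the second ancilla in the state $|0\rangle$. We can rewrite

\[ |\psi_b^a\rangle = \cos\theta\, |\phi_0\rangle |0\rangle + \sin\theta\, |\phi_1\rangle |1\rangle \]

where $|\phi_0\rangle$ and $|\phi_1\rangle$ are normalized states on $N$ qubits and $\cos\theta$ and $\sin\theta$ depend on the amplitudes of the components of $|\psi_b^a\rangle$ that have 0 and 1 on the last qubit, respectively. The value $\theta$ can be determined from the construction for the mutually-unbiased bases $V_b^a$. Observe that $|\phi_0\rangle |0\rangle$ and $|\phi_1\rangle |1\rangle$ are the renormalized projections of $|\psi_b^a\rangle$ onto the subspaces where the ancilla is in $|0\rangle$ and $|1\rangle$, respectively.

In order to make use of just one round of amplitude amplification (Section 1.3.4), we will rotate the second ancilla to create a "nice" angle. Choose

\[ \alpha = \frac{\cos\frac{\pi}{3}}{\cos\theta} \]

and apply the rotation

\[ R = \begin{pmatrix} \alpha & -\sqrt{1 - \alpha^2} \\ \sqrt{1 - \alpha^2} & \alpha \end{pmatrix} \]

to the second ancilla that is in its initial state $|0\rangle$. The state of the whole $N + 2$ qubit system becomes

\[ |\varphi\rangle = \alpha \cos\theta\, |\phi_0\rangle |00\rangle + \sqrt{1 - \alpha^2} \cos\theta\, |\phi_0\rangle |01\rangle - \sqrt{1 - \alpha^2} \sin\theta\, |\phi_1\rangle |10\rangle + \alpha \sin\theta\, |\phi_1\rangle |11\rangle. \]

Substituting

\[ \sin\tfrac{\pi}{3}\, |\phi^\perp\rangle = \sqrt{1 - \alpha^2} \cos\theta\, |\phi_0\rangle |01\rangle - \sqrt{1 - \alpha^2} \sin\theta\, |\phi_1\rangle |10\rangle + \alpha \sin\theta\, |\phi_1\rangle |11\rangle \]

we end up with

\[ |\varphi\rangle = \cos\tfrac{\pi}{3}\, |\phi_0\rangle |00\rangle + \sin\tfrac{\pi}{3}\, |\phi^\perp\rangle. \]

Using only one round of amplitude amplification (see Lemma 1.3.3), we can amplify the amplitude of $|\phi_0\rangle |00\rangle$ from $\cos\frac{\pi}{3}$ to $\cos\frac{3\pi}{3} = \cos\pi$, which has modulus 1, so the target state is reached up to a global phase. As this is a product state between the $N$ qubits and the ancillas, and the ancillas have been restored to their initial value, we can discard both ancillas after this step.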
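The effect of a single amplification round on a state with "good" amplitude $\cos\frac{\pi}{3} = \frac{1}{2}$ can be seen in a small numerical sketch. It uses the generic textbook form of the operator, a reflection about the good state followed by a reflection about the start state, acting on an arbitrary 8-dimensional example; the vectors `good`, `bad`, `start` and the function `amplify_once` are illustrative assumptions and not the gates of the construction above.

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 8

# "good" target state and a random normalized state orthogonal to it
good = np.zeros(dim, dtype=complex)
good[0] = 1.0
bad = rng.normal(size=dim) + 1j * rng.normal(size=dim)
bad[0] = 0.0
bad /= np.linalg.norm(bad)

# start state with good amplitude cos(pi/3) = 1/2
start = np.cos(np.pi / 3) * good + np.sin(np.pi / 3) * bad


def amplify_once(start, good):
    """Apply Q = (1 - 2|start><start|)(1 - 2|good><good|) once to start."""
    flip_good = np.eye(len(good)) - 2 * np.outer(good, good.conj())
    flip_start = np.eye(len(start)) - 2 * np.outer(start, start.conj())
    return flip_start @ flip_good @ start


result = amplify_once(start, good)
```

With this sign convention `result` equals `-good`, that is, the good state up to a global phase, in line with the amplitude $\cos\pi$.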

In order to implement the amplitude amplification step, we need to implement the reflections $U_{\mathrm{bad}}$ and $U_0$. We will formalize these first for the computational basis on the $N + 2$ qubit system. This yields

\[ U_{\mathrm{bad}} : |x_0 x_1 \cdots x_N x_{N+1}\rangle \mapsto (-1)^{[x < 2^N]}\, |x_0 x_1 \cdots x_N x_{N+1}\rangle \]

and

\[ U_0 : |x_0 x_1 \cdots x_N x_{N+1}\rangle \mapsto (-1)^{[x_0 = x_1 = \cdots = x_{N+1} = 0]}\, |x_0 x_1 \cdots x_N x_{N+1}\rangle. \]

$U_{\mathrm{bad}}$ is thus just a phase gate conditional on both ancillas being one, so we can treat it as a two-qubit gate acting on the ancillas only. $U_0$ is a phase gate conditional on all qubits being zero, which is equivalent to an $(N+1)$-fold controlled phase

\[ U_0 = \operatorname{diag}(-1, 1, \dots, 1). \]

It can be implemented [NC00] using $O(N^2)$ single and two-qubit gates in depth $O(N)$. Figure 4.3 shows the complete circuit that computes $\tilde{V}_b^a = Q V_b^a$, where $Q = V_b^a U_0 (V_b^a)^\dagger U_{\mathrm{bad}}$ is the amplitude amplification operator. □

Figure 4.3: Circuit that computes the projected MUBs $\tilde{V}_b^a |0^{N}\rangle$. The classical controls $a$ and $b$ are not shown here.

Therefore adding two qubits to the system enables us to employ the prime dimension construction for mutually-unbiased bases. In the remainder of this section, we will show an explicit gate decomposition of $V_b^a$. The arithmetic in $\mathbb{F}_p$ will be implemented on $N + 1$ qubits, so that we will need only one additional qubit. Furthermore, this ancilla does not need to be sent through the channel or be fed to the noisy implementation $\mathcal{E}$, as it can be discarded after the projected mutually-unbiased state has been constructed.

The construction from Theorem 3.3.2 ensures that the initial angle is given by $\cos\theta = \sqrt{2^N / p}$, therefore $\theta = \arccos\sqrt{2^N / p}$ for all $a, b \in \mathbb{F}_p$. In case of the computational basis, we do not need to use any amplitude amplification. In this case the resulting state either is a computational basis state for $N$ qubits or the projection is zero, in which case we will define that the final measurement yields the correct value, depending on which basis state $|b\rangle$ is to be chosen.

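We will now show that $V_b^a$ has an efficient gate decomposition.

Theorem 4.4.6. $V_b^a$ can be realized using $N + 1$ qubits with $O(N^2)$ gates in depth $O(N)$.

Proof. From Theorem 3.3.2 the states of a maximal set of $p + 1$ mutually-unbiased bases are given by

\[ |\psi_b^a\rangle = \frac{1}{\sqrt{p}} \sum_{x \in \mathbb{F}_p} \omega_p^{a x^2 + b x}\, |x\rangle \tag{4.15} \]

for $a, b \in \mathbb{F}_p$. This can be rewritten as

\[ |\psi_b^a\rangle = \frac{1}{\sqrt{p}} \sum_{x \in \mathbb{F}_p} \omega_p^{a x^2}\, \omega_p^{b x}\, |x\rangle. \]

We see from this form that we need to implement the basic operations

\[ |x\rangle \mapsto (\omega_p)^{b x}\, |x\rangle \tag{4.16} \]

and

\[ |x\rangle \mapsto \omega_p^{a x^2}\, |x\rangle. \tag{4.17} \]

The implementation of (4.16) is straightforward using a phase gate

\[ P_i = \begin{pmatrix} 1 & 0 \\ 0 & \omega_p^{b 2^i} \end{pmatrix} \]

on the $i$-th qubit, where we label the $N + 1$ qubits from 0 to $N$.

Lemma 4.4.7. For all $|x\rangle$,

\[ \bigotimes_{i=0}^{N} P_i\, |x\rangle = (\omega_p)^{b x}\, |x\rangle. \]

Proof. We use the decomposition $x = \sum_{i=0}^{N} x_i 2^i$. Direct calculation gives

\[ \bigotimes_{i=0}^{N} P_i\, |x\rangle = \bigotimes_{i=0}^{N} \left( P_i |x_i\rangle \right) = \bigotimes_{i=0}^{N} (\omega_p)^{b 2^i x_i}\, |x_i\rangle = \prod_{i=0}^{N} (\omega_p)^{b 2^i x_i}\, |x_0 x_1 \cdots x_N\rangle = (\omega_p)^{b \sum_{i} 2^i x_i}\, |x\rangle = (\omega_p)^{b x}\, |x\rangle. \]

□

For the implementation of (4.17) we observe that for $x = \sum_{i=0}^{N} x_i 2^i$ we get

\[ x^2 = \left( \sum_{i=0}^{N} x_i 2^i \right)^2 = \sum_{i,j=0}^{N} x_i x_j 2^{i+j} = \sum_{i<j} x_i x_j 2^{i+j+1} + \sum_{i=0}^{N} x_i^2\, 2^{2i} = \sum_{i<j} x_i x_j 2^{i+j+1} + \sum_{i=0}^{N} x_i\, 2^{2i} \]

where the last equation follows from $x^2 = x$ for $x \in \{0, 1\}$. Therefore

\[ \omega_p^{a x^2} = \prod_{i<j} \omega_p^{a 2^{i+j+1} x_i x_j} \prod_{i=0}^{N} \omega_p^{a 2^{2i} x_i}. \tag{4.18} \]

The first term in (4.18) is a product of

\[ \binom{N+1}{2} = \frac{N(N+1)}{2} = O(N^2) \]

conditional phases $\omega_p^{a 2^{i+j+1}}$, each of which can be realized using a controlled-phase gate. The remaining term corresponds to a single qubit phase gate on each of the $N + 1$ qubits.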
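The bit-level decomposition of the exponent can be verified with plain integer arithmetic. The sketch below is illustrative only (the function name and argument order are assumptions): it accumulates the exponent contributed by the single-qubit phases for $bx$, the single-qubit phases for the diagonal part of $ax^2$, and the controlled phases for the cross terms, and reduces it modulo $p$.

```python
def phase_exponent_from_bits(a, b, x, n_qubits, p):
    """Exponent of omega_p collected gate by gate for the basis state |x>.

    Single-qubit phases contribute (a * 2**(2i) + b * 2**i) * x_i and the
    controlled phases contribute a * 2**(i + j + 1) * x_i * x_j for i < j.
    """
    bits = [(x >> i) & 1 for i in range(n_qubits)]
    exponent = 0
    for i in range(n_qubits):
        exponent += (a * 2 ** (2 * i) + b * 2 ** i) * bits[i]
        for j in range(i + 1, n_qubits):
            exponent += a * 2 ** (i + j + 1) * bits[i] * bits[j]
    return exponent % p
```

The result should coincide with `(a * x * x + b * x) % p` for every basis state, which is exactly the phase required by (4.15).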



Figure 4.4: The Circuit that Maps $|x\rangle \mapsto \omega^{a x^2 + b x} |x\rangle$, where $\omega = e^{2\pi i / p}$.

Thus we end up with the phase injection circuit in Figure 4.4. The phases are classically controlled by the values of $a$ and $b$. The complete circuit is shown in Figure 4.5 and shows how the Hadamard gates that create the initial superposition are conditional on whether the basis is the computational basis or one of the $p$ other mutually-unbiased bases. We define $a = p$ to denote the computational basis, restrict $a \in \{0, 1, \dots, p\}$ and $b \in \{0, 1, \dots, p-1\}$, and assume a binary encoding of $a$ and $b$ into $\lceil \log(p + 1) \rceil = N + 1$ classical bits. Both the phase and the controlled-phase gates are conditional on the classical choices for $a$ and $b$. If $a = p$, we just select the $b$-th state of the computational basis. The last part of the complete circuit is responsible for that. The total cost is $\frac{N(N+1)}{2} + 3(N+1) = O(N^2)$ single and two-qubit gates on $N + 1$ qubits in a depth of $N + 3 = O(N)$. This circuit can easily be reversed by reversing each of the one and two-qubit gates. The Hadamard is its own inverse, whereas the inverse of a phase gate is a phase gate with the inverse phase.



Figure 4.5: The Circuit that Creates $|\psi_b^a\rangle$ Given $a$ and $b$ Classically (inputs $|0\rangle$ on all qubits, followed by the blocks "Phase $ax^2$" and "Phase $bx$").

□



4.4.4 Prime Power Dimension Construction 

The construction for a Hilbert space of prime dimension can be generalized to a Hilbert space of prime power dimension $p^N$, where $p > 2$ is a prime and $N \in \mathbb{N}$ is the number of qudits in the system, each a system of dimension $p$. In that case, the construction from Theorem 3.3.2 reads

\[ |\psi_b^a\rangle = \frac{1}{\sqrt{p^N}} \sum_{x \in GF(p^N)} \omega_p^{\operatorname{tr}(a x^2 + b x)}\, |x\rangle \tag{4.19} \]

for $a, b \in GF(p^N)$, plus the computational basis for $a = p^N$ by definition. Using the polynomial representation from Section A.10.1 we may choose to represent $GF(p^N)$ in a vector space $\mathbb{F}_p^N$ so that each qudit encodes the coefficient of $\xi^i$, $i \in \{0, 1, \dots, N-1\}$, of $x = x_0 + x_1 \xi + \cdots + x_{N-1} \xi^{N-1} \in GF(p^N)$. We use $\xi$ to denote the formal variable in the polynomial representation in order to avoid confusion with the variable $x$ that we will use to denote an element of $GF(p^N)$. We define $\mathbf{x}$ as the column vector

\[ \mathbf{x} = (x_0, x_1, \dots, x_{N-1})^T \]

that represents $x$ in $\mathbb{F}_p^N$.

As the trace is a linear functional $\operatorname{tr} : GF(p^N) \to \mathbb{F}_p$, Fact A.10.20 guarantees the existence of a vector $\mathbf{t} \in \mathbb{F}_p^N$ such that $\operatorname{tr} x = \langle \mathbf{t}, \mathbf{x} \rangle$, where $\langle \cdot, \cdot \rangle$ denotes the usual inner product on $\mathbb{F}_p^N$. Furthermore, the multiplication in $GF(p^N)$ is a linear function on $\mathbb{F}_p^N$ and thus for any $y \in GF(p^N)$ there is a matrix $M_y \in \mathbb{F}_p^{N \times N}$ such that $\mathbf{yx} = M_y \mathbf{x}$. Hence

\[ \operatorname{tr} yx = \langle \mathbf{t}, M_y \mathbf{x} \rangle = \langle \mathbf{t}_y, \mathbf{x} \rangle \tag{4.20} \]

where $\mathbf{t}_y^T = \mathbf{t}^T M_y$.

This representation (4.20) of the trace function enables us to rewrite (4.19) as

\[ |\psi_b^a\rangle = \frac{1}{\sqrt{p^N}} \sum_{x \in GF(p^N)} \omega_p^{\langle \mathbf{t}_a, \mathbf{x^2} \rangle}\, \omega_p^{\langle \mathbf{t}_b, \mathbf{x} \rangle}\, |x\rangle \]

where $\mathbf{t}_a$ and $\mathbf{t}_b$ are classical values that can be precomputed by the classical control.

Beaudrap et al. [dBCW02] showed how to implement a generalization of the Quantum Fourier Transform for qudits.

Definition 4.4.8. The generalized Quantum Fourier Transform on qudits of dimension $p$ relative to any nonzero linear mapping $\varphi$ on $GF(p^N)$ is defined as

\[ F_{p^N, \varphi} : |x\rangle \mapsto \frac{1}{\sqrt{p^N}} \sum_{y \in GF(p^N)} \omega_p^{\varphi(xy)}\, |y\rangle. \]

Theorem 4.4.9 ([dBCW02]). Let $p$ be a constant, $N \in \mathbb{N}$, and let $\varphi : GF(p^N) \to \mathbb{F}_p$ be a nonzero linear mapping. Then $F_{p^N, \varphi}$ can be performed exactly by a quantum circuit of size $O(N^2)$.

This makes use of the fact that every nonzero linear functional $\varphi$ on $GF(p^N)$ can be represented as $\varphi(xy) = \mathbf{x}^T M_\varphi \mathbf{y}$ where $M_\varphi \in \mathbb{F}_p^{N \times N}$, which is a generalized inner product on $\mathbb{F}_p^N$. In the case of the trace function, we can reduce $\varphi$ to the conventional inner product $\varphi(xy) = \mathbf{x}^T \mathbf{y}$ and prepare the initial state as $\mathbf{t}_b$. Figure 4.6 shows how this can be done when $\mathbf{t}_b$ is prepared by the classical control. Now the implementation in the proof of Theorem 4.4.9 reduces to implementing $F_p$ on every qudit, which can be done exactly using $O(N)$ gates.

Figure 4.6: The Circuit that Creates $\frac{1}{\sqrt{p^N}} \sum_{y \in GF(p^N)} \omega_p^{\operatorname{tr}(by)}\, |y\rangle$.

After that, the phases for the quadratic term need to be injected. Thus we need to implement the transformation

\[ |x\rangle \mapsto \omega_p^{\operatorname{tr}(a x^2)}\, |x\rangle. \tag{4.21} \]

Suppose the primitive polynomial that is used in the representation of $GF(p^N)$ is $h(\xi)$. First we observe that for $x \in GF(p^N)$, $x = x_0 + x_1 \xi + \cdots + x_{N-1} \xi^{N-1}$,

\[ x^2 = \sum_{i,j=0}^{N-1} x_i x_j\, \xi^{i+j} \bmod h(\xi) = \sum_{i<j} 2 x_i x_j \left( \xi^{i+j} \bmod h(\xi) \right) + \sum_{i=0}^{N-1} x_i^2 \left( \xi^{2i} \bmod h(\xi) \right). \]

Denote by $\boldsymbol{\xi}^{(i,j)}$ the vector corresponding to $\xi^{i+j} \bmod h(\xi)$. Then the inner product $\langle \mathbf{t}_a, \mathbf{x^2} \rangle$ can be written as

\[ \langle \mathbf{t}_a, \mathbf{x^2} \rangle = \sum_{i,j=0}^{N-1} x_i x_j\, \langle \mathbf{t}_a, \boldsymbol{\xi}^{(i,j)} \rangle \tag{4.22} \]

where $\mathbf{t}_a$ only depends on the number of qudits $N$ and the classical parameter $a$, whereas $\boldsymbol{\xi}^{(i,j)}$ only depends on the number of qudits $N$. Thus $t_a^{(i,j)} = \langle \mathbf{t}_a, \boldsymbol{\xi}^{(i,j)} \rangle$ can be computed classically for all values of $a$, $i$, and $j$. Since the two ordered terms $(i,j)$ and $(j,i)$ in (4.22) coincide, we only need to implement two-qudit gates

\[ \mathrm{Phase}_{i,j} : |u\rangle \otimes |v\rangle \mapsto \omega_p^{2 t_a^{(i,j)} u v}\, |u\rangle \otimes |v\rangle \]

for all cross-term qudits in (4.22), and single-qudit gates

\[ \mathrm{Phase}_{i} : |u\rangle \mapsto \omega_p^{t_a^{(i,i)} u^2}\, |u\rangle \]

for the diagonal terms. Hence we need $\binom{N}{2}$ two-qudit gates and $N$ single-qudit gates, and the circuit has depth $O(N)$, as each qudit appears $2N - 1 = O(N)$ times in (4.22).

The computational basis can be integrated by classically controlling the phase gates on $a \neq p^N$. If $a = p^N$, we prepare the state $|b\rangle$ using classically-controlled addition gates on each qudit. This costs $N$ additional one-qudit gates. Therefore the whole circuit can be implemented using $O(N^2)$ single and two-qudit gates in depth $O(N)$. Figure 4.7 shows the complete circuit.

Figure 4.7: The Circuit that Creates $|\psi_b^a\rangle$ Given $a$, $b$, $t_a^{(i,j)}$ and $\mathbf{t}_b$ Classically.

4.4.5 Galois Ring Construction 

Another approach to constructing MUBs in dimension $d = 2^N$ is to use Theorem 3.3.4 and employ Galois ring arithmetic. Although it is known how to implement Galois ring arithmetic classically [Abr04], so that it can be realized on a quantum computer with only polynomial overhead, it is not clear how to do it with only a modest number of additional qubits. The problem is that the Galois ring has $4^N$ elements, which require $2N$ qubits for a faithful representation.

Remember the expression for the states of the mutually-unbiased bases (3.3)

\[ |\psi_b^a\rangle = \frac{1}{\sqrt{2^N}} \sum_{x \in \mathcal{T}} \omega_4^{\operatorname{tr}((a + 2b) x)}\, |x\rangle. \]

Although we can classically precompute the vectors such that the trace expression in (3.5) reduces to inner products in $\mathbb{Z}_4^N$ using the ideas from the previous section, the representation problem remains. Specifically, we can compute vectors $\mathbf{t}_{X^i}$ for a basis $\{X^i \mid i = 0, 1, \dots, N-1\}$ for $\mathbb{Z}_4^N$. Thus we can reduce $\operatorname{tr}((a + 2b) x)$ to $\mathbf{t}_x^T \mathbf{y}$, where $\mathbf{y}$ is the polynomial representation of $a + 2b$. As $a + 2b$ ranges throughout all of $GR(4^N)$ and as we do not need to partition the states (3.5) into bases, we can ignore the Teichmüller decomposition. Using the precomputed values of $\mathbf{t}_{X^i}$, all we need to compute quantumly is the polynomial representation of $x \in \mathcal{T}$. However, a naive approach will require $2N$ qubits and enough multiplication operations to realize $X \mapsto X^j \bmod h(X)$ for all $j = 1, \dots, 2^N - 2$. Even using repeated squaring, this still requires $N$ multiplications. Each multiplication seems to require at least $N$ elementary gates in the quantum setting [BBF02], thus we end up with at least $O(N^2)$ gates on at least $N$ ancilla qubits.

4.5 Discussion 

Mutually-unbiased bases are a powerful tool in quantum information. They can be used to efficiently estimate the average fidelity in an experimental context. [KR05a] showed that they might be even more powerful, as they form a 2-design for quantum states, for which we gave a different proof.

We constructed efficient circuits that generate a state out of a complete set of mutually-unbiased bases. However, these circuits still need $O(N^2)$ gates, which might be reduced to only $O(N)$ gates. As these circuits should be used for noise estimation, we need to make sure that the additional noise caused by the circuits is small compared to the circuit to be measured. As the approximate Quantum Fourier Transform can be realized with $O(N \log N)$ gates, it seems that $O(N^2)$ is still too high.



Chapter 5 
Unitary 2-Designs 



5.1 Motivation and Notation

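In the previous section, we showed how to construct a set of unitary transformations $\mathcal{U} = \{U_k \mid k = 1, \dots, K\}$ on a $d$-dimensional Hilbert space $\mathcal{H}_d$, $d$ being a prime power, such that

$$\frac{1}{K} \sum_{k=1}^{K} \langle\psi_0|U_k^\dagger M U_k|\psi_0\rangle \langle\psi_0|U_k^\dagger N U_k|\psi_0\rangle = \int_{F\text{-}S} \langle\psi|M|\psi\rangle \langle\psi|N|\psi\rangle \, d\psi \qquad (5.1)$$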

for any linear operators $M, N \in L(\mathcal{H})$. We also saw that this is equivalent to the condition that $\mathcal{U}$ generates a state 2-design given a fiducial initial state ($|0\rangle$ in the previous chapter). However, (5.1) only holds for a very specific initial state $|\psi_0\rangle$, and the constructions in the previous sections actually required starting in the $|0\rangle$ state. Although we might choose an arbitrary initial state and map it to $|0\rangle$, this might be hard to realize. From (2.11) we know that this might require a circuit of size exponential in the number of qubits $N = \lceil \log d \rceil$ needed to realize $\mathcal{H}_d$. Hence we are interested in more general 2-designs which are based on a set of unitaries $\mathcal{U}$ that generate the states from a given initial state. The motivation is that the unitary invariance of the Fubini-Study measure enables us to turn any Fubini-Study integral into an integral over the Haar measure on $U(d)$ with an arbitrary initial state $|\psi_0\rangle$,

$$\int_{F\text{-}S} f(|\psi\rangle) \, d\psi = \int_{U(d)} f(U|\psi_0\rangle) \, dU.$$

Let us first define how a state 2-design arises from a finite set of unitaries.

Definition 5.1.1. A 2-design for quantum states is generated by a set of operators $\mathcal{U} \subset U(d)$ if there exists $|\psi_0\rangle \in \mathcal{H}$ such that (5.1) holds for all $M, N \in L(\mathcal{H})$.
The first extension is a set of unitary transformations that generate a 2-design for quantum states independent of the initial state $|\psi_0\rangle$.

Definition 5.1.2. A set of unitary operators $\mathcal{U} \subset U(d)$ generates a 2-design for states from any state if (5.1) holds for any $M, N \in L(\mathcal{H})$ and any $|\psi_0\rangle \in \mathcal{H}$.

We can further generalize this into a unitary 2-design that gives the same average as 
the Haar measure on U(d) for operator-valued functions on U(d). 

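Definition 5.1.3. A unitary 2-design is a set of unitary operators $\mathcal{U} = \{U_1, \dots, U_K\} \subset U(d)$ such that

$$\frac{1}{K} \sum_{k=1}^{K} U_k^\dagger M U_k N U_k^\dagger O U_k = \int_{U(d)} U^\dagger M U N U^\dagger O U \, dU \qquad (5.2)$$

for all linear operators $M, N, O \in L(\mathcal{H})$.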

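A unitary 2-design generates a 2-design on quantum states for any initial state. We can pick an arbitrary initial state $|\psi_0\rangle$ and multiply (5.2) from both sides to get

$$\frac{1}{K} \sum_{k=1}^{K} \langle\psi_0|U_k^\dagger M U_k N U_k^\dagger O U_k|\psi_0\rangle = \int_{U(d)} \langle\psi_0|U^\dagger M U N U^\dagger O U|\psi_0\rangle \, dU.$$

Replace $N = |\psi_0\rangle\langle\psi_0|$, and (5.1) follows.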

We can also define a unitary 2-design in analogy to the definition of a 2-design for states (Definition 4.3.2), where the homogeneous polynomial is of homogeneous degree (2,2) in the matrix elements of $U \in U(d)$ and global phases are ignored again. From (5.2), we can conclude that a unitary 2-design will correctly give the integral over all monomials of degree (2,2) in $U$. This is extended to all homogeneous polynomials of degree (2,2) by linearity. The 2-design condition now reads that a set $S \subseteq U(D)$ is a unitary 2-design if for all polynomials $p(U)$ of homogeneous degree (2,2),

$$\frac{1}{|S|} \sum_{U \in S} p(U) = \int_{U(D)} p(U) \, dU. \qquad (5.3)$$

For ease of notation, we will use Definition 5.1.3 but fall back to this alternate definition to support certain arguments.

Although the term "unitary 2-design" does not seem to have appeared before, such an object has already been identified by [PBKLO04] for the case of a single qubit and by [DLT02] for an arbitrary number of qubits. However, they could only give an approximate sampling algorithm. We will give a different proof and a construction that yields circuits of smaller complexity and fewer random bits, while still being exponentially close to a 2-design in the induced superoperator norm and even in the stronger diamond norm.

We will introduce the notation of the Pauli operators and the Clifford group here. Consider a system that consists of $N$ qudits, each of dimension $d$ with Hilbert space $\mathcal{H}_d$, forming a system with Hilbert space $\mathcal{H}$ of dimension $D = d^N$.

Definition 5.1.4. Let $\mathcal{P}(d)$ denote the set of $d^2$ generalized Pauli operators in dimension $d$,

$$\mathcal{P}(d) = \{X^a Z^b : a, b = 0, 1, \dots, d-1\},$$

with the generalized Pauli operators acting on the computational basis as

$$X: |j\rangle \mapsto |j + 1 \bmod d\rangle, \qquad Z: |j\rangle \mapsto \omega^j |j\rangle$$

with $\omega = e^{2\pi i/d}$. Note that some authors call these the Heisenberg-Weyl operators.

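It directly follows that $ZX = \omega XZ$. Further, we have that

$$X^a: |j\rangle \mapsto |j + a \bmod d\rangle, \qquad Z^a: |j\rangle \mapsto \omega^{aj} |j\rangle$$

and

$$(X^a)^\dagger: |j\rangle \mapsto |j - a \bmod d\rangle, \qquad (Z^b)^\dagger: |j\rangle \mapsto \omega^{-bj} |j\rangle.$$

Applying $ZX = \omega XZ$ repeatedly, we have the commutation relation

$$Z^b X^a = \omega^{ab} X^a Z^b, \qquad X^a Z^b = \omega^{-ab} Z^b X^a$$

for $a, b \in \mathbb{Z}_d$, which implies

$$X^{a_1} Z^{b_1} X^{a_2} Z^{b_2} = \omega^{a_2 b_1 - a_1 b_2} X^{a_2} Z^{b_2} X^{a_1} Z^{b_1}$$

for all $a_1, a_2, b_1, b_2 \in \mathbb{Z}_d$. We also note that the set $\mathcal{P}(d) = \{X^a Z^b \mid a, b = 0, 1, \dots, d-1\}$ is a basis for $L(\mathcal{H}_d)$. This can be seen using the Hilbert-Schmidt inner product, which yields

$$\begin{aligned}
\operatorname{tr}\left((X^a Z^b)^\dagger X^{a'} Z^{b'}\right) &= \operatorname{tr}\left((Z^b)^\dagger (X^a)^\dagger X^{a'} Z^{b'}\right) = \operatorname{tr}\left((Z^b)^\dagger X^{a'-a} Z^{b'}\right) \\
&= \operatorname{tr}\left(X^{a'-a} Z^{b'} (Z^b)^\dagger\right) = \operatorname{tr}\left(X^{a'-a} Z^{b'-b}\right) \\
&= \sum_{j=0}^{d-1} \langle j|X^{a'-a} Z^{b'-b}|j\rangle = \sum_{j=0}^{d-1} \omega^{(b'-b)j} \langle j|X^{a'-a}|j\rangle \\
&= \delta_{a',a} \sum_{j=0}^{d-1} \omega^{(b'-b)j} = d\, \delta_{a',a}\, \delta_{b',b}.
\end{aligned}$$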
We can turn $\mathcal{P}(d)$ into an orthonormal basis if we normalize by $1/\sqrt{d}$. As we want $\mathcal{P}(d)$ to be a set of unitary operators, we skip this normalization.
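The orthogonality computation can be checked numerically for small $d$. The following is a minimal sketch using NumPy; the helper names `pauli_xz`, `pauli` and `gram_matrix` are illustrative and not part of the original text.

```python
import numpy as np

def pauli_xz(d):
    """Return the shift X, the clock Z and omega for a d-level system."""
    omega = np.exp(2j * np.pi / d)
    X = np.roll(np.eye(d), 1, axis=0)      # X|j> = |j+1 mod d>
    Z = np.diag(omega ** np.arange(d))     # Z|j> = omega^j |j>
    return X, Z, omega

def pauli(d, a, b):
    """The generalized Pauli operator X^a Z^b."""
    X, Z, _ = pauli_xz(d)
    return np.linalg.matrix_power(X, a) @ np.linalg.matrix_power(Z, b)

def gram_matrix(d):
    """Hilbert-Schmidt inner products tr(P^dagger Q) over all d^2 operators."""
    ops = [pauli(d, a, b) for a in range(d) for b in range(d)]
    return np.array([[np.trace(p.conj().T @ q) for q in ops] for p in ops])
```

For $d = 3$ the Gram matrix comes out as $3 \cdot \mathbb{1}_9$, which is the statement $\operatorname{tr}((X^aZ^b)^\dagger X^{a'}Z^{b'}) = d\,\delta_{a',a}\delta_{b',b}$.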

Note that in the case of qubits, we have $d = 2$ and the Pauli operators $\mathcal{P}(2)$ are, up to a global phase, sometimes written as

$X^0 Z^0 = \mathbb{1}$,
$X^1 Z^0 = \sigma_x$,
$X^0 Z^1 = \sigma_z$, and
$X^1 Z^1 = \sigma_y$.

Taking all possible tensor products of $N$ generalized Pauli operators yields the tensor-product Pauli operators, which we will denote by

$$\mathcal{P}(d, N) = \{X^{a_1} Z^{b_1} \otimes \cdots \otimes X^{a_N} Z^{b_N} \mid a_i, b_i \in \mathbb{Z}_d\},$$

but we will use the short-hand notation

$$X^{a_1} Z^{b_1} \otimes \cdots \otimes X^{a_N} Z^{b_N} = X^{\mathbf{a}} Z^{\mathbf{b}}$$

for $\mathbf{a}, \mathbf{b} \in \mathbb{Z}_d^N$. $\mathcal{P}(d, N)$ is a basis for $L(\mathcal{H})$ consisting of $d^{2N} = D^2$ elements. When $d$ and $N$ are clear from the context, we will write $\mathcal{P}(D)$ instead of $\mathcal{P}(d, N)$.

The commutation relation of these tensor-product Paulis can be deduced from the commutation relation of the generalized Pauli operators. Using the short-hand vector notation, we see that

$$X^{\mathbf{a}_1} Z^{\mathbf{b}_1} X^{\mathbf{a}_2} Z^{\mathbf{b}_2} = \omega^{\mathbf{a}_2 \cdot \mathbf{b}_1 - \mathbf{a}_1 \cdot \mathbf{b}_2} X^{\mathbf{a}_2} Z^{\mathbf{b}_2} X^{\mathbf{a}_1} Z^{\mathbf{b}_1}$$

for $\mathbf{a}_1, \mathbf{a}_2, \mathbf{b}_1, \mathbf{b}_2 \in \mathbb{Z}_d^N$. We can further simplify this expression by considering vectors $x = (x_a, x_b) \in \mathbb{Z}_d^{2N}$ together with the symplectic inner product $(x, y)_{Sp} = x_a \cdot y_b - x_b \cdot y_a$, where $x_a$ denotes the vector consisting of the first $N$ components of $x$, $x_b$ the vector consisting of the last $N$ components, and $u \cdot v$ denotes the usual inner product. Observe that the symplectic inner product is linear and $(x, y)_{Sp} = -(y, x)_{Sp}$. Together with the notation

$$P_x = X^{x_a} Z^{x_b}$$

we can write the commutation relation in the concise form

$$P_x P_y = \omega^{-(x, y)_{Sp}} P_y P_x. \qquad (5.4)$$
We note that $\frac{1}{\sqrt{D}} P_x$ is normalized, but we need the property that $P_x$ is unitary, so we skip the normalization.
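A consequence of (5.4) that does not depend on sign conventions is that $P_x$ and $P_y$ commute exactly when $(x, y)_{Sp} = 0 \bmod d$. The sketch below checks this by brute force for small systems; the function names are hypothetical and chosen for this illustration only.

```python
import numpy as np
from itertools import product

def tensor_pauli(x, d):
    """P_x = X^{x_a} Z^{x_b} for x = (x_a, x_b) in Z_d^{2N}."""
    n = len(x) // 2
    omega = np.exp(2j * np.pi / d)
    X = np.roll(np.eye(d), 1, axis=0)
    Z = np.diag(omega ** np.arange(d))
    out = np.eye(1)
    for a, b in zip(x[:n], x[n:]):
        single = np.linalg.matrix_power(X, a) @ np.linalg.matrix_power(Z, b)
        out = np.kron(out, single)
    return out

def symplectic(x, y, d):
    """Symplectic inner product x_a . y_b - x_b . y_a, reduced mod d."""
    n = len(x) // 2
    return (np.dot(x[:n], y[n:]) - np.dot(x[n:], y[:n])) % d

def commute_iff_symplectic_zero(d, n):
    """Check: P_x P_y = P_y P_x exactly when (x, y)_Sp = 0 mod d."""
    vecs = list(product(range(d), repeat=2 * n))
    for x in vecs:
        for y in vecs:
            Px, Py = tensor_pauli(x, d), tensor_pauli(y, d)
            commute = np.allclose(Px @ Py, Py @ Px)
            if commute != (symplectic(x, y, d) == 0):
                return False
    return True
```

The check is exhaustive, so it is only practical for very small $d$ and $N$.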

Sometimes, we will identify the elements of $\mathcal{P}(d, N)$ with integers $j = 1, 2, \dots, D^2$. Ignoring global phases that are introduced by the commutation relation, we can treat $\mathcal{P}(d, N)$ as the group of tensor-product Pauli operators. From the commutation relations, we have that

$$P_x P_y = \omega^{x_b \cdot y_a} P_{x+y}.$$

Let $\mathcal{P}'(d, N)$ be the group generated by $\mathcal{P}(d, N)$ and define the equivalence relation $P \equiv Q$ if and only if there is $\alpha \in \mathbb{C}$ such that $P = \alpha Q$. We can identify $\mathcal{P}(d, N) = \mathcal{P}'(d, N)/\equiv$ using

$$P_x P_y \equiv P_{x+y}.$$

The identity element is given by $P_0 = \mathbb{1}^{\otimes N}$. We will denote $\mathcal{P}(d, N)$ with multiplication defined by ignoring phases as the Pauli group with dimension $(d, N)$. This equivalence relation essentially ignores global phases caused by the commutation relation. This approach is reasonable if we consider conjugation by tensor-product Pauli operators, which will be one of the main tools used in this section.

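Definition 5.1.5. Let $\Lambda$ be a completely-positive superoperator on $\mathcal{H}$. Define the Pauli-twirled superoperator $\Lambda_P$ by

$$\Lambda_P(\rho) = \frac{1}{D^2} \sum_{j=1}^{D^2} P_j^\dagger \Lambda(P_j \rho P_j^\dagger) P_j$$

for any $\rho \in L(\mathcal{H})$.

A Pauli superoperator is a superoperator $\Lambda$ such that

$$\Lambda(\rho) = \sum_{j=1}^{D^2} c_j P_j \rho P_j^\dagger$$

for all linear operators $\rho$, where $P_j \in \mathcal{P}(D)$.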

Definition 5.1.6. The Clifford group $\mathcal{C}(D)$ is the normalizer of the tensor-product Pauli group $\mathcal{P}(D)$ under conjugation, i.e.

$$\mathcal{C}(D) = \{V \in U(D) \mid V \mathcal{P} V^\dagger \subseteq \mathcal{P}\}.$$

The Clifford group plays an important role in quantum error correction [Got97] and has been used before to show a similar twirling result that already shows that
5.2 The Clifford Group is a Unitary 2-Design
5.2.1 The Previous Result

In 2001, [DLT02] introduced the notion of a "Clifford Twirl" and showed a condition that is equivalent to a unitary 2-design. We will present that part of their result and show that it is equivalent to a unitary 2-design for $U(D)$ where $D = 2^N$.

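Theorem 5.2.1. For all states $\rho$ on $\mathcal{H} \otimes \mathcal{H}$,

$$\frac{1}{|\mathcal{C}(D)|} \sum_{C \in \mathcal{C}(2^N)} (C \otimes C) \rho (C^\dagger \otimes C^\dagger) = \int_{U(D)} (U \otimes U) \rho (U^\dagger \otimes U^\dagger) \, dU. \qquad (5.5)$$

Proof. The proof is given in Section A.1 of [DLT02]. □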

It follows easily that $\mathcal{C}(2^N)$ is a unitary 2-design as the following corollary shows.

Corollary 5.2.2. $\mathcal{C}(2^N)$ is a unitary 2-design.

Proof. States are Hermitian matrices of trace 1. First, we extend (5.5) to all Hermitian matrices $\rho$ using that $\rho' = \rho + \frac{1 - \operatorname{tr}\rho}{D^2} \mathbb{1}$ is Hermitian with trace 1. By the linearity of the sum and the integral, we only need to consider $\frac{1 - \operatorname{tr}\rho}{D^2} \mathbb{1}$. As the elements of $\mathcal{C}(D)$ and $U(D)$ are unitary operators, (5.5) also holds for $\frac{1}{D^2} \mathbb{1}$.

We extend (5.5) to all linear operators $\rho$ on $\mathcal{H} \otimes \mathcal{H}$ using the fact that there is a Hermitian basis for the operators on $\mathcal{H} \otimes \mathcal{H}$.

By choosing $\rho$ appropriately, we can show the 2-design condition (5.3) for all monomials of homogeneous degree (2,2). By linearity, the result follows for all homogeneous polynomials of degree (2,2). □



5.2.2 A Different Proof 

Inspired by discussions with Daniel Gottesman and [Cha05], we will give a different proof that the Clifford group is a unitary 2-design. The argument starts by showing that "twirling" a completely-positive superoperator by tensor-product Pauli operators gives a completely-positive superoperator with only tensor-product Pauli operators as operation elements. After that, the Clifford group symmetrizes their weights to give a unitarily invariant superoperator. In an argument slightly more complicated than Corollary 5.2.2, we deduce that this implies a unitary 2-design.
Lemma 5.2.3. Define $\chi_j(P_x) = \omega^{(j, x)_{Sp}}$. Then $\chi_j$ is a character of $\mathcal{P}$ for any $j$. Furthermore, for all $j \in \mathbb{Z}_d^{2N}$, $j \neq 0$,

$$\sum_{x \in \mathbb{Z}_d^{2N}} \chi_j(P_x) = 0. \qquad (5.6)$$

Proof. Observe that

$$\chi_j(P_x) \chi_j(P_y) = \omega^{(j, x)_{Sp}} \omega^{(j, y)_{Sp}} = \omega^{(j, x+y)_{Sp}} = \chi_j(P_{x+y}) = \chi_j(P_x P_y).$$

As long as $j \neq 0$, $\chi_j$ is a nontrivial character of $\mathcal{P}$ and (5.6) is the well-known character sum formula. See [LN94, Ch. 5] for a proof. □
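The character sum (5.6) can be verified by direct enumeration for small $d$ and $N$. This is a minimal sketch using only the standard library; `character_sum` is a name introduced here for illustration.

```python
from itertools import product
import cmath

def character_sum(jv, d):
    """Sum over x in Z_d^{2N} of omega^{(jv, x)_Sp}, with jv = (j_a, j_b)."""
    n = len(jv) // 2
    omega = cmath.exp(2j * cmath.pi / d)
    total = 0
    for x in product(range(d), repeat=2 * n):
        # symplectic inner product j_a . x_b - j_b . x_a
        sp = sum(jv[i] * x[n + i] - jv[n + i] * x[i] for i in range(n))
        total += omega ** (sp % d)
    return total
```

For $j = 0$ the sum equals $D^2 = d^{2N}$, and for every other $j$ it vanishes up to floating-point error, which is the form $D^2 \delta_{r,s}$ used in the proof of Lemma 5.2.4.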

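Lemma 5.2.4. Twirling a completely-positive superoperator $\Lambda$ with the tensor-product Pauli group $\mathcal{P}$ yields a Pauli superoperator $\Lambda_P$.

Proof. We use that the tensor-product Pauli operators $P_j \in \mathcal{P}(D)$ form an orthogonal basis for $L(\mathcal{H})$, thus we can write the superoperator

$$\Lambda(\rho) = \sum_{k=1}^{D^2} A_k \rho A_k^\dagger$$

as

$$\Lambda(\rho) = \sum_k \sum_{r \in \mathbb{Z}_d^{2N}} \alpha_{k,r} P_r \, \rho \sum_{s \in \mathbb{Z}_d^{2N}} \alpha_{k,s}^* P_s^\dagger,$$

where

$$A_k = \sum_{r \in \mathbb{Z}_d^{2N}} \alpha_{k,r} P_r.$$

The Pauli-twirled superoperator can now be simplified to

$$\begin{aligned}
\Lambda_P(\rho) &= \frac{1}{D^2} \sum_{j \in \mathbb{Z}_d^{2N}} \sum_k \sum_{r,s \in \mathbb{Z}_d^{2N}} \alpha_{k,r} \alpha_{k,s}^* \, P_j^\dagger P_r P_j \, \rho \, P_j^\dagger P_s^\dagger P_j \\
&= \frac{1}{D^2} \sum_{r,s \in \mathbb{Z}_d^{2N}} \sum_k \alpha_{k,r} \alpha_{k,s}^* \sum_{j \in \mathbb{Z}_d^{2N}} P_j^\dagger P_r P_j \, \rho \, P_j^\dagger P_s^\dagger P_j \\
&= \frac{1}{D^2} \sum_{r,s \in \mathbb{Z}_d^{2N}} \sum_k \alpha_{k,r} \alpha_{k,s}^* \sum_{j \in \mathbb{Z}_d^{2N}} \omega^{(j, r-s)_{Sp}} P_r \rho P_s^\dagger
\end{aligned}$$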
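where we used the commutation relation (5.4).
From Lemma 5.2.3 we have

$$\sum_{j \in \mathbb{Z}_d^{2N}} \omega^{(j, r-s)_{Sp}} = D^2 \delta_{r,s}.$$

Thus we can simplify the expression of the Pauli-twirled superoperator to

$$\begin{aligned}
\Lambda_P(\rho) &= \frac{1}{D^2} \sum_{r,s \in \mathbb{Z}_d^{2N}} \sum_k \alpha_{k,r} \alpha_{k,s}^* D^2 \delta_{r,s} P_r \rho P_s^\dagger \\
&= \sum_{r \in \mathbb{Z}_d^{2N}} \sum_k \alpha_{k,r} \alpha_{k,r}^* P_r \rho P_r^\dagger.
\end{aligned}$$

This shows that $\Lambda_P$ is indeed a Pauli superoperator with real coefficients

$$\sum_k |\alpha_{k,r}|^2.$$

□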
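The lemma can be illustrated numerically for a single qubit. The sketch below twirls a map given by arbitrary Kraus-type operators and compares the result with the predicted Pauli superoperator; all names are chosen for this example, and the set `PAULIS` uses $X^aZ^b$ with phases ignored, as in the text.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
PAULIS = [I2, X, Z, X @ Z]   # X^a Z^b for a, b in {0, 1}

def channel(kraus, rho):
    """Lambda(rho) = sum_k A_k rho A_k^dagger."""
    return sum(A @ rho @ A.conj().T for A in kraus)

def pauli_twirl(kraus, rho):
    """(1/D^2) sum_j P_j^dagger Lambda(P_j rho P_j^dagger) P_j for D = 2."""
    return sum(P.conj().T @ channel(kraus, P @ rho @ P.conj().T) @ P
               for P in PAULIS) / 4

def pauli_weights(kraus):
    """Coefficients sum_k |alpha_{k,r}|^2 with alpha_{k,r} = tr(P_r^dagger A_k) / D."""
    return [sum(abs(np.trace(P.conj().T @ A) / 2) ** 2 for A in kraus)
            for P in PAULIS]
```

Applying `pauli_twirl` to any operator `rho` should agree with $\sum_r c_r P_r \rho P_r^\dagger$, where the $c_r$ come from `pauli_weights`.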

The following theorem shows how a Pauli superoperator can be twirled into a unitarily invariant superoperator using the Clifford group. As $\mathcal{P}(D)$ is by definition a normal subgroup of $\mathcal{C}(D)$, it suffices to consider $\mathcal{C}(D)/\mathcal{P}(D)$, which is called the "symplectic group" $SL(D)$ in [Cha05]. This name arises from the fact that the Clifford group needs to preserve the commutation relations between the tensor-product Paulis. That, in turn, means that it needs to preserve the symplectic inner product, as it specifies the commutation relation (5.4).

Theorem 5.2.5. [Cha05] Twirling a Pauli superoperator $\Lambda_P$ by the symplectic group $SL(D)$ turns it into a unitarily invariant superoperator $\Lambda_U$.

Proof. Using the argument from [DLT02], we note that the symplectic group will map each non-identity Pauli $P_j$, $j > 1$, equally often to $\omega^l P_{j'}$ for all $l = 1, 2, \dots, d$ and all $j' > 1$, as it is the coset group of the normalizer of the Pauli group $\mathcal{P}(D)$. We need the identity that for all
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P e L(H), 

D 2 D 2 D 2 

3=1 3=1 1=1 

3=1 1=1 

í=l i=l 

= D 2 tr(p t l)l = D 2 trpl 

where we used Lemma 5.2.3 in the same way we did in the proof of Lemma 5.2.4 and that $\mathcal{P}(D)$ forms a basis for $L(\mathcal{H})$.

Now a calculation shows that for all $\rho \in L(\mathcal{H})$,



^ = jsm E cwpoc 



CeSL(D) 

D 2 



1 V n C€SL(D) j=l 



1 D 2 / D 2 \ 
3=2 \Z=1 / 



\ 1=2 / / \ 1=2 

This structure of $\Lambda_U$ shows that it is unitarily invariant. □

Lemma 5.2.6. Let $\mu$ be a probability measure on $U(D)$ and let $\Lambda$ be a superoperator. Define the $\mu$-twirled superoperator

$$\Lambda_\mu = \int_{U(D)} \hat{V} \Lambda \hat{V}^\dagger \, d\mu(V).$$

If $\Lambda_\mu$ is unitarily invariant, then $\Lambda_\mu = \Lambda_T$, where $\Lambda_T$ is the Haar-twirled superoperator.

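Proof. Using the unitary invariance of $\Lambda_\mu$ and the normalization of the Haar measure, it follows that

$$(\Lambda_\mu)_T = \int_{U(D)} \hat{U} \Lambda_\mu \hat{U}^\dagger \, dU = \int_{U(D)} \Lambda_\mu \, dU = \Lambda_\mu.$$

Using the unitary invariance of $\Lambda_T$ from Lemma 2.3.17 and the normalization of the probability measure $\mu$, we have

$$(\Lambda_T)_\mu = \int_{U(D)} \hat{V} \Lambda_T \hat{V}^\dagger \, d\mu(V) = \int_{U(D)} \Lambda_T \, d\mu(V) = \Lambda_T.$$

The linearity of the integral ensures that we can change the order of integration. Together with the unitary invariance of the Haar measure, we see that the order of twirling does not matter:

$$\begin{aligned}
(\Lambda_\mu)_T &= \int_{U(D)} \hat{U} \Lambda_\mu \hat{U}^\dagger \, dU \\
&= \int_{U(D)} \hat{U} \int_{U(D)} \hat{V} \Lambda \hat{V}^\dagger \, d\mu(V) \, \hat{U}^\dagger \, dU \\
&= \int_{U(D)} \int_{U(D)} \hat{U} \hat{V} \Lambda \hat{V}^\dagger \hat{U}^\dagger \, d\mu(V) \, dU \\
&= \int_{U(D)} \int_{U(D)} \hat{V} \hat{U}' \Lambda \hat{U}'^\dagger \hat{V}^\dagger \, d\mu(V) \, dU' \\
&= \int_{U(D)} \hat{V} \int_{U(D)} \hat{U}' \Lambda \hat{U}'^\dagger \, dU' \, \hat{V}^\dagger \, d\mu(V) \\
&= \int_{U(D)} \hat{V} \Lambda_T \hat{V}^\dagger \, d\mu(V) \\
&= (\Lambda_T)_\mu,
\end{aligned}$$

where we used the change of variables $U = V U' V^\dagger$ for the fourth line. Therefore $\Lambda_\mu = \Lambda_T$. □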
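Definition 5.2.7. Define

$$\mathrm{Sum}_D(\rho, \Lambda) = \frac{1}{|\mathcal{C}(D)|} \sum_{C \in \mathcal{C}(D)} C^\dagger \Lambda(C \rho C^\dagger) C$$

and

$$\mathrm{Int}_D(\rho, \Lambda) = \int_{U(D)} U^\dagger \Lambda(U \rho U^\dagger) U \, dU.$$

Observe that both $\mathrm{Sum}_D(\rho, \Lambda)$ and $\mathrm{Int}_D(\rho, \Lambda)$ are linear functions of $\rho \in L(\mathcal{H})$ for fixed $\Lambda$.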

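Theorem 5.2.8. Twirling a superoperator $\Lambda$ by $\mathcal{C}(D)$ is the same as Haar-twirling. Formally, for all linear operators $\rho$, we have that

$$\mathrm{Sum}_D(\rho, \Lambda) = \mathrm{Int}_D(\rho, \Lambda) = p\rho + q \operatorname{tr}\rho \, \frac{\mathbb{1}}{D}. \qquad (5.7)$$

Proof. Lemma 5.2.4 and Theorem 5.2.5 in conjunction with Lemma 2.3.13 show that

$$\mathrm{Sum}_D(\rho, \Lambda) = p\rho + q \operatorname{tr}\rho \, \frac{\mathbb{1}}{D}$$

for some parameters $p$, $q$ with $0 \le p \le 1$. Corollary 2.3.14 shows that

$$\mathrm{Int}_D(\rho, \Lambda) = p'\rho + q' \operatorname{tr}\rho \, \frac{\mathbb{1}}{D}$$

for some constants $p'$, $q'$. Lemma 5.2.6 shows that $p = p'$, $q = q'$. □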
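For a single qubit ($D = 2$) the form (5.7) can be checked numerically. The sketch below is an illustration under stated assumptions: it enumerates the 24 single-qubit Clifford operations modulo global phase by closing $\{H, P\}$ under multiplication (the phase gate is called `S` in the code), and twirls the map $\Lambda(\rho) = M\rho M^\dagger$. The helper names are not from the original text.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
S = np.array([[1, 0], [0, 1j]], dtype=complex)   # the phase gate P of the text
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def strip_phase(U):
    """Fix the global phase so the first non-zero entry is real and positive."""
    flat = U.flatten()
    pivot = flat[np.flatnonzero(np.abs(flat) > 1e-9)[0]]
    return U * (abs(pivot) / pivot)

def clifford_group():
    """Close {H, S} under multiplication, modulo global phase."""
    identity = np.eye(2, dtype=complex)
    found = {tuple(np.round(identity, 6).flatten()): identity}
    frontier = [identity]
    while frontier:
        nxt = []
        for U in frontier:
            for G in (H, S):
                V = strip_phase(G @ U)
                key = tuple(np.round(V, 6).flatten())
                if key not in found:
                    found[key] = V
                    nxt.append(V)
        frontier = nxt
    return list(found.values())

def clifford_twirl(M, rho, group):
    """Sum_D(rho, Lambda) for Lambda(rho) = M rho M^dagger."""
    return sum(C.conj().T @ M @ C @ rho @ C.conj().T @ M.conj().T @ C
               for C in group) / len(group)
```

If (5.7) holds, the twirl maps each traceless Pauli $\sigma$ to $p\,\sigma$ with one common $p$, and maps $\mathbb{1}$ to a multiple of $\mathbb{1}$.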
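Corollary 5.2.9. For any $M, N \in L(\mathcal{H})$,

$$\frac{1}{|\mathcal{C}(D)|} \sum_{C \in \mathcal{C}(D)} C^\dagger M C N C^\dagger M^\dagger C = \int_{U(D)} U^\dagger M U N U^\dagger M^\dagger U \, dU. \qquad (5.8)$$

Proof. We consider the superoperator $\Lambda(\rho) = M \rho M^\dagger$. (5.8) follows directly from (5.7) by setting $\rho = N$. □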

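Lemma 5.2.10. For any Hermitian $M, N, O \in L(\mathcal{H})$,

$$\frac{1}{|\mathcal{C}(D)|} \sum_{C \in \mathcal{C}(D)} C^\dagger M C N C^\dagger O C = \int_{U(D)} U^\dagger M U N U^\dagger O U \, dU. \qquad (5.9)$$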
Proof. Fix an arbitrary Hermitian $N \in L(\mathcal{H})$. Define the operators

$$\mathrm{Sum}'(M, O) = \frac{1}{|\mathcal{C}(D)|} \sum_{C \in \mathcal{C}(D)} C^\dagger M C N C^\dagger O C$$

and

$$\mathrm{Int}'(M, O) = \int_{U(D)} U^\dagger M U N U^\dagger O U \, dU.$$

Then (5.8) reads

$$\mathrm{Sum}'(M, M) = \mathrm{Int}'(M, M) \qquad (5.10)$$

where we used that $M$, $N$, $O$ are Hermitian operators. Furthermore, we can see that

$$\mathrm{Sum}'(M, O) = \mathrm{Sum}'(O, M)^\dagger \quad \text{and} \quad \mathrm{Int}'(M, O) = \mathrm{Int}'(O, M)^\dagger \qquad (5.11)$$

as $M$, $N$ and $O$ are Hermitian.

We can extend (5.8) to work with two different operators by considering $M_1 = M + O$ and $M_2 = M + iO$. From (5.8), we get $\mathrm{Sum}'(M_j, M_j^\dagger) = \mathrm{Int}'(M_j, M_j^\dagger)$, $j = 1, 2$. By the bilinearity of both $\mathrm{Sum}'$ and $\mathrm{Int}'$, we can expand both sides for $j = 1, 2$ and subtract (5.10). Using (5.11), we end up with

$$\mathrm{Sum}'(M, O) + \mathrm{Sum}'(M, O)^\dagger = \mathrm{Int}'(M, O) + \mathrm{Int}'(M, O)^\dagger, \qquad (5.12)$$
$$i\,\mathrm{Sum}'(M, O) - i\,\mathrm{Sum}'(M, O)^\dagger = i\,\mathrm{Int}'(M, O) - i\,\mathrm{Int}'(M, O)^\dagger. \qquad (5.13)$$

Observe that $i \cdot$ (5.12) + (5.13) yields

$$2i\,\mathrm{Sum}'(M, O) = 2i\,\mathrm{Int}'(M, O)$$

and (5.9) follows. □
Lemma 5.2.11. (5.9) holds for any Hermitian $N$ and all $M, O \in L(\mathcal{H})$.

Proof. $\mathrm{Sum}'(M, O)$ and $\mathrm{Int}'(M, O)$ from the previous lemma are bilinear forms on $L(\mathcal{H})$. Thus the construction in the proof of Corollary 4.2.6 applies and (5.9) holds for any linear operators $M$, $O$ and all Hermitian $N$. □

Lemma 5.2.12. (5.9) holds for all $M, N, O \in L(\mathcal{H})$.
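Proof. In a last step, we extend $N$ in (5.9) from Hermitian to any linear operator. We fix $M, O \in L(\mathcal{H})$ and define

$$\mathrm{Sum}''(N) = \frac{1}{|\mathcal{C}(D)|} \sum_{C \in \mathcal{C}(D)} C^\dagger M C N C^\dagger O C$$

and

$$\mathrm{Int}''(N) = \int_{U(D)} U^\dagger M U N U^\dagger O U \, dU.$$

It is immediate that $\mathrm{Sum}''(N)$ and $\mathrm{Int}''(N)$ are linear in $N$ for all $N \in L(\mathcal{H})$. So far, $\mathrm{Sum}''(N) = \mathrm{Int}''(N)$ only holds for Hermitian $N$, but it can be extended to all $N \in L(\mathcal{H})$ by linearity and the fact that we can express the canonical basis $\{|k\rangle\langle l|\}_{k,l=1}^{D}$ for $L(\mathcal{H})$ as linear combinations of Hermitian operators:

$$|k\rangle\langle l| = \frac{1}{2}\left(|k\rangle\langle l| + |l\rangle\langle k|\right) + \frac{1}{2i}\, i\left(|k\rangle\langle l| - |l\rangle\langle k|\right),$$

where $|k\rangle\langle l| + |l\rangle\langle k|$ and $i(|k\rangle\langle l| - |l\rangle\langle k|)$ are Hermitian. □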

Corollary 5.2.13. $\mathcal{C}(D) = SL(D) \circ \mathcal{P}(D)$ is a unitary 2-design.

This concludes the proof that the Clifford group is a unitary 2-design. However, it is not clear how to sample uniformly at random from the Clifford group, and only a randomized algorithm is known so far for the case of qubits, i.e. $D = 2^N$ [DLT02]. This algorithm uses $O(N^8)$ classical steps and produces a circuit of size $O(N^2)$, where the distribution is close to the uniform distribution over $\mathcal{C}(D)$ in the $l_1$-norm. We will see in the next section how this approximation shows up in the 2-design condition.

5.2.3 Efficient Approximate Construction 

In this section, we will prove that a subset of the Clifford group $\mathcal{C}(D)$ already gives an approximate 2-design in the induced superoperator norm. Our construction also only works for qubits, thus we also assume $D = 2^N$ here.

Theorem 5.2.14. For any $\varepsilon > 0$, twirling a Pauli superoperator $\Lambda_P$ by a subset $SL_\varepsilon(D) \subseteq SL(D)$ of the symplectic group turns it into a superoperator $\Lambda_\varepsilon$ such that

$$\|\Lambda_\varepsilon - \Lambda_U\| \le B(\Lambda)(\varepsilon_0 + \varepsilon)$$

for $\varepsilon_0 = \frac{1}{2^N - 2^{-N}}$ and $\Lambda_U$ the unitarily invariant channel from Theorem 5.2.5. The norm is the induced operator norm from $L(\mathcal{H})$ and the parameter $B(\Lambda)$ will be determined later.
The circuits in $SL_\varepsilon(D)$ consist of

$$O\left(N \log \frac{1}{\varepsilon}\right)$$

single and two-qubit gates in depth

$$O\left(\log N \log \frac{1}{\varepsilon}\right)$$

and the construction needs

$$O\left(N \log \frac{1}{\varepsilon}\right)$$

random bits. The subset is of size

$$|SL_\varepsilon(D)| = 2^{O(N \log \frac{1}{\varepsilon})}.$$

Proof. The task is to find a subset $\mathcal{C}_\varepsilon$ that uniformizes the tensor-product Paulis with high probability, i.e. that maps a non-identity tensor-product Pauli to any tensor-product Pauli with almost equal probability.

We can choose suitable phase factors for the Pauli operators, as they are irrelevant in the Kraus operator-sum representation $\sum_P p_P P \rho P^\dagger$, where they cancel out. Therefore, a typical Pauli will look like the following:

$$\sigma_x \otimes \sigma_z \otimes \sigma_y \otimes \mathbb{1} \otimes \sigma_z \otimes \mathbb{1} \otimes \mathbb{1} \otimes \sigma_y \otimes \sigma_z$$

(a) Basic Building Blocks In general, we consider a tensor product of Paulis that is not equivalent to the identity, thus at least one component is not $\mathbb{1}$. We can use the cyclic shift generator $T = HP$, where

$$H = |+\rangle\langle 0| + |-\rangle\langle 1|$$

is the Hadamard gate and

$$P = |0\rangle\langle 0| + i|1\rangle\langle 1|$$

is the phase gate. Ignoring global phases, we see that

$$T \sigma_x T^\dagger = \sigma_y, \qquad T \sigma_y T^\dagger = \sigma_z, \qquad T \sigma_z T^\dagger = \sigma_x$$

using the convenient $\sigma_j$ notation for a single-qubit Pauli. We thus have that $T$, $T^2$, and $T^3 = \mathbb{1}$ generate permutations of a single-qubit Pauli, which will be used as a building block later in the construction.
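The cyclic action of $T = HP$ can be confirmed with a few matrix products. In this sketch, `equal_up_to_phase` is a helper introduced for the illustration; it compares two matrices modulo a global phase, matching the convention of the text.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
P = np.array([[1, 0], [0, 1j]], dtype=complex)
T = H @ P

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

def equal_up_to_phase(A, B):
    """True if A = c B for a complex number c of modulus 1."""
    c = np.trace(B.conj().T @ A) / np.trace(B.conj().T @ B)
    return bool(np.isclose(abs(c), 1) and np.allclose(A, c * B))

def conjugate(U, sigma):
    """Return U sigma U^dagger."""
    return U @ sigma @ U.conj().T
```

Conjugating $\sigma_x$, $\sigma_y$, $\sigma_z$ by `T` yields $\sigma_y$, $\sigma_z$, $\sigma_x$ up to sign, and `T @ T @ T` is a phase multiple of the identity.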
How to Twirl Two Single-Qubit Paulis Now notice that we can conjugate pairs of Pauli operators in the tensor product by a CNOT to create or annihilate identities. To see this, realize that the action of a CNOT is

$$\mathrm{CNOT}\,(X^{a_1} Z^{b_1} \otimes X^{a_2} Z^{b_2})\,\mathrm{CNOT}^\dagger = X^{a_1} Z^{b_1 - b_2} \otimes X^{a_2 - a_1} Z^{b_2}$$

where the minus in the exponent is chosen to stay consistent with the general qudit case. Hence we create identities if $a_1 = a_2 = 1$ and $b_2 = 0$, with some back-action that will modify the Pauli on the control qubit. We will take care of that later and note that we will use either the $X^a Z^b$ or $\sigma_{a,b}$ notation, or even shorter $\sigma_i$, where $i \in \mathbb{Z}_4 = \mathbb{Z}_2 \times \mathbb{Z}_2$.
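For qubits, the CNOT conjugation rule can be checked exhaustively over all 16 two-qubit Paulis. The following is a sketch with names invented for this example; the first tensor factor is the control, and equality is tested up to a global phase.

```python
import numpy as np
from itertools import product

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)   # control = first qubit

def xz(a, b):
    """Single-qubit Pauli X^a Z^b."""
    return np.linalg.matrix_power(X, a) @ np.linalg.matrix_power(Z, b)

def equal_up_to_phase(A, B):
    """True if A = c B for a complex number c of modulus 1."""
    c = np.trace(B.conj().T @ A) / np.trace(B.conj().T @ B)
    return bool(np.isclose(abs(c), 1) and np.allclose(A, c * B))

def cnot_rule_holds():
    """Check CNOT (X^a1 Z^b1 (x) X^a2 Z^b2) CNOT^dagger against the stated rule."""
    for a1, b1, a2, b2 in product(range(2), repeat=4):
        lhs = CNOT @ np.kron(xz(a1, b1), xz(a2, b2)) @ CNOT.conj().T
        rhs = np.kron(xz(a1, (b1 - b2) % 2), xz((a2 - a1) % 2, b2))
        if not equal_up_to_phase(lhs, rhs):
            return False
    return True
```

In particular, $\sigma_x \otimes \sigma_x$ is mapped to $\sigma_x \otimes \mathbb{1}$, which is the identity-creating case $a_1 = a_2 = 1$, $b_2 = 0$.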

(b) Step 1: How to Generate a $\sigma_x$ or $\sigma_y$ with Constant Probability We want to use this construction to generate a tensor-product Pauli where a specific component has an $X$ or $Y$ Pauli with constant probability. We can reduce this to the much simpler problem of a binary string $x \in \{0, 1\}^N$ that is guaranteed to have at least one 1 and we can change a pair of positions by a controlled-NOT operation in the following way:

    x     CNOT(x)
    00    00
    01    01
    10    11
    11    10

This is the abstraction of conjugating a tensor-product Pauli by CNOT gates if we identify $X^a Z^b$ with $a$.

Now we can make use of the well-known fact that for $x \in \{0, 1\}^N$, $x \neq 0^N$,

$$P_{b \in \{0,1\}^N}(b \cdot x = 1) = \frac{1}{2},$$

where $b \cdot x = \sum_{i=1}^{N} b_i x_i \bmod 2$. We restrict $b \neq 0^N$ and observe that $0^N \cdot x = 0$ for all $x \in \{0, 1\}^N$. Hence

$$P_{b \in \{0,1\}^N,\, b \neq 0^N}(b \cdot x = 1) > \frac{1}{2}$$

and we define

$$p_{success} = \frac{1}{2}.$$

Pick a non-empty subset $B \subseteq \{1, 2, \dots, N\}$ uniformly at random and apply the CNOT conjugation to the first bit position in $B$ from all the other positions in $B$. Then the first bit in $B$ will be 1 with probability greater than 1/2.
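The probability bound can be made exact by enumeration. This sketch, using only the standard library, counts the non-zero masks $b$ with $b \cdot x = 1$; the function name is chosen for this illustration.

```python
from fractions import Fraction
from itertools import product

def success_probability(x):
    """P(b . x = 1 mod 2) for b uniform over the non-zero bit strings of len(x)."""
    n = len(x)
    nonzero = [b for b in product((0, 1), repeat=n) if any(b)]
    hits = sum(1 for b in nonzero
               if sum(bi * xi for bi, xi in zip(b, x)) % 2 == 1)
    return Fraction(hits, len(nonzero))
```

For every non-zero $x$ of length $N$ the count gives $2^{N-1}/(2^N - 1)$, which is strictly larger than 1/2 and approaches it as $N$ grows.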
Going back to the tensor-product Paulis, this result implies that picking a random target qubit and conjugating with CNOT gates from a random subset of the remaining qubits as controls will ensure, with probability greater than 1/2, that the target qubit has an $X$ or $Y$ Pauli.

(c) Step 2: How to Generate an Almost Uniform Distribution over $\mathcal{P}$ We will now pick the target qubit from step 1 as our control qubit. Then, we apply a single-qubit $T^{s_i}$ on each other qubit, for independently and randomly chosen $s_i \in \{0, 1, 2\}$. This will uniformize all non-identity Paulis on the target qubits. After that, we will independently twirl with a CNOT gate on each of the other qubits as target and controlled on the control qubit, with probability 3/4 each. We assume that the control qubit has an $X$ or $Y$ Pauli, which is guaranteed to happen with probability greater than 1/2 by the previous step.

Consider a target qubit $t$. This either has the $\mathbb{1}$ operator or $X$, $Y$, or $Z$ with probability 1/3 each. Observe the effects of a CNOT-twirl that is applied with probability 3/4.

    Target    Result of CNOT-Twirl    Probability
    1         1                       1/4
    1         X                       3/4
    X         X                       1/4
    X         1                       3/4
    Y         Y                       1/4
    Y         Z                       3/4
    Z         Z                       1/4
    Z         Y                       3/4

We see that if the Pauli on $t$ was $\mathbb{1}$, it will be twirled to a non-identity Pauli with probability 3/4. If the Pauli was not $\mathbb{1}$, we see that it is twirled to an identity with probability 1/4 and stays a non-identity with probability 3/4. Once again, we insert a single-qubit twirl $T^{s_i}$ on each target qubit for independently and randomly chosen $s_i \in \{0, 1, 2\}$ to ensure that the non-identity Paulis on each target qubit will have probability 1/3 each. The circuit generated so far is shown in Figure 5.1. Note that by the back-action of the CNOT twirl, it might change the control qubit Pauli from $X$ to $Y$ or vice-versa.
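The effect of the sequence $T$-twirl, CNOT-twirl, $T$-twirl on a single target qubit can be tracked with exact fractions. The sketch below encodes the table above as a lookup (assuming an $X$ or $Y$ on the control) and treats the $T$-twirl as an even spreading of the non-identity weight; names and data layout are choices made for this example.

```python
from fractions import Fraction

# Action of the CNOT conjugation on the target when the control carries X or Y.
FLIP = {"1": "X", "X": "1", "Y": "Z", "Z": "Y"}

def cnot_twirl(dist, p_apply=Fraction(3, 4)):
    """Apply the CNOT conjugation with probability p_apply."""
    out = {k: Fraction(0) for k in "1XYZ"}
    for pauli, p in dist.items():
        out[pauli] += (1 - p_apply) * p
        out[FLIP[pauli]] += p_apply * p
    return out

def t_twirl(dist):
    """Average over T^s, s in {0, 1, 2}: spreads the non-identity weight evenly."""
    rest = (dist["X"] + dist["Y"] + dist["Z"]) / 3
    return {"1": dist["1"], "X": rest, "Y": rest, "Z": rest}

def step_two_on_target(dist):
    """T-twirl, CNOT-twirl with probability 3/4, then T-twirl again."""
    return t_twirl(cnot_twirl(t_twirl(dist)))
```

Whether the target starts as $\mathbb{1}$ or as a non-identity Pauli, the resulting distribution assigns probability 1/4 to each of the four Paulis.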

The next step is to randomize the control qubit. We note that all other qubits already have a uniformly chosen tensor-product Pauli on them, with probability $1/4^{N-1}$ each.

Observation: If we apply any permutation of Paulis to these $N - 1$ target qubits, it will not change their distribution, for it is already uniform.

We randomize the control qubit Pauli by a $P$-twirl with probability 1/2. Note that $P$ permutes $X$ and $Y$. As the control qubit Pauli starts out in an unknown distribution of $X$ and $Y$ caused by the back-action of the previous step, the random $P$-twirl will set it to $X$ or $Y$ with probability 1/2 each. Now we apply a random CNOT with the former control qubit (which we will call the "first" qubit from now on) as target and controlled by each other qubit. Each CNOT is applied with probability 1/2, and we see that the back-action that might modify some of the now-control qubits does not affect the distribution over the Paulis on these qubits. The observation shows that the uniformity is not changed by a permutation caused by possible back-actions.

Figure 5.1: $T$-Twirl and CNOT Gates with Probability 3/4 each from a Randomly Chosen Control

This procedure will successfully randomize the first qubit, as it will change it from $X$ or $Y$ with probability 1/2 each to $Z$ or $\mathbb{1}$ with probability 1/2 each, where a change occurs with probability 1/2 if at least one of the other qubits has an $X$ or $Y$ Pauli. After that, it will be one of the four Paulis with probability 1/4 each, and other permutations by later CNOT twirls will not change this uniform distribution due to our observation. The complete circuit of step 2 up to here is illustrated in Figure 5.2.

However, the distribution obtained so far will not have any weight on the tensor-product Paulis that are $\mathbb{1}$ on the first qubit and $\mathbb{1}$ or $Z$ on the other qubits, as this case prevents any change to the first qubit and it remains $X$ or $Y$ with probability 1/2 each. Adding a $T^j$-twirl will at least randomize between $X$, $Y$, or $Z$. Thus the only non-reachable tensor-product Paulis are those with $\mathbb{1}$ on the first qubit and $\mathbb{1}$ or $Z$ on the other qubits. The sample circuit for $N = 5$ in Figure 5.3 illustrates the complete twirl.

We will now show that this part of the random procedure will generate an almost uniform distribution over all possible tensor-product Paulis except the all-identity Pauli. For a precise estimation, we will consider the $l_1$-distance between the uniform probability distribution

$$u(x) = \begin{cases} \frac{1}{4^N - 1} & x \neq 0 \\ 0 & x = 0 \end{cases}$$

on all Paulis but the identity and the distribution $q(x)$ obtained by this process. In case the tensor-product Pauli we started with had an $X$ or $Y$ at the randomly chosen control, the process will produce an almost uniform distribution $q$, which can be seen in Figure 5.4.

Figure 5.2: Step 2 of the Pauli Uniformization Process, Highlighting the Random CNOT Parts.
Figure 5.4: The $l_1$-distance between $u$ and $q$.
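Precisely, the $l_1$-distance between $u$ and $q$ is given by

$$\begin{aligned}
\|u - q\|_1 &= \sum_x |u(x) - q(x)| \\
&= (4^N - 2^{N-1}) \left( \frac{1}{4^N - 2^{N-1}} - \frac{1}{4^N - 1} \right) + \frac{1}{4^N - 1} (2^{N-1} - 1) \\
&= 1 - \frac{4^N - 2^{N-1}}{4^N - 1} + \frac{2^{N-1} - 1}{4^N - 1} \\
&= \frac{2(2^{N-1} - 1)}{4^N - 1} \le \frac{2^N}{4^N - 1} = \frac{1}{2^N - 2^{-N}} = \varepsilon_0,
\end{aligned}$$

which is exponentially small in the number of qubits.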
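The closed form for the $l_1$-distance can be cross-checked with exact rational arithmetic. The sketch assumes, as the derivation does, that $q$ is uniform on the $4^N - 2^{N-1}$ reachable Paulis and zero on the $2^{N-1}$ unreachable ones (which include the identity); the function names are illustrative.

```python
from fractions import Fraction

def l1_distance(n):
    """||u - q||_1 for u uniform on non-identity Paulis, q uniform on reachable ones."""
    total = 4 ** n
    unreachable = 2 ** (n - 1)     # 1 on the first qubit, 1 or Z elsewhere
    reachable = total - unreachable
    u = Fraction(1, total - 1)
    q = Fraction(1, reachable)
    # reachable Paulis contribute |q - u| each; unreachable non-identity ones contribute u
    return reachable * abs(q - u) + (unreachable - 1) * u

def epsilon_0(n):
    """The bound 1 / (2^N - 2^{-N})."""
    return 1 / (Fraction(2) ** n - Fraction(1, 2 ** n))
```

For example, `l1_distance(3)` evaluates to `Fraction(2, 21)`, below `epsilon_0(3) = Fraction(8, 63)`.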



(d) Further Uniformizing the Distribution In the bad case where our choice of control will pick a qubit with a $\mathbb{1}$ or $Z$, we use the fact that our random circuit is a probability distribution over permutations over the tensor-product Paulis. Denote by $p$ the initial distribution over the tensor-product Paulis after the random chain step. We have that for any permutation $\pi$ acting on a probability distribution $p$, $\|\pi \circ p - u\| = \|p - u\|$ for any distance $\|\cdot\|$, as permutations will only permute the probabilities in the distribution, which has no effect on any norm. With a probability distribution $r$ over permutations $\pi_i$, we see that

$$\left\| \sum_i r(\pi_i)\, \pi_i \circ p - u \right\| = \left\| \sum_i r(\pi_i) (\pi_i \circ p - u) \right\| \le \sum_i r(\pi_i) \|\pi_i \circ p - u\|$$

for any distance $\|\cdot\|$ using the triangle inequality and that $\sum_i r(\pi_i) = 1$. This is the convex-linearity of the distance.

This argument shows that in the case where we do not have $X$ or $Y$ on the control qubit, the process will not increase the distance to $u$. Denote $d_0 = \|p - u\|$, and by the convex-linearity of the distance we see that

$$d_1 \le p_{success} \|q - u\| + (1 - p_{success}) \|p - u\| \le p_{success} \|q - u\| + (1 - p_{success}).$$

We can define the following recursion for the decrease in distance after steps 1 and 2 have been applied to an initial distribution $p'$. Denote by $d_k$ the distance to $u$ after step $k$. The $l_1$-distance for any $k \ge 0$ is given by the recursion

$$\begin{aligned}
d_{k+1} &\le P(\text{Step 2 works for } x)\, \|q - u\|_1 + P(\text{Step 2 does not work for } x)\, d_k \\
&\le p_{success} \|q - u\|_1 + (1 - p_{success})\, d_k \qquad \text{if } d_k \ge \|q - u\|_1 \qquad (5.14)
\end{aligned}$$

as $p_{success}$ is a lower bound for $P(\text{Step 2 works for } x)$. If $d_k < \|q - u\|_1$, we cannot use that lower bound but have to step back to

$$d_{k+1} \le p_{success} \|q - u\|_1 + d_k \le (p_{success} + 1) \|q - u\|_1 \le 2 \|q - u\|_1.$$

Thus our analysis can only guarantee a bound twice as high, after which we are in the regime of (5.14) again.
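We can solve (5.14) analytically and see that

$$\begin{aligned}
d_k &\le p_{success}\, \varepsilon_0 \sum_{i=0}^{k-1} (1 - p_{success})^i + (1 - p_{success})^k d_0 \\
&= p_{success}\, \varepsilon_0\, \frac{1 - (1 - p_{success})^k}{1 - (1 - p_{success})} + (1 - p_{success})^k d_0 \\
&= \varepsilon_0 \left( 1 - (1 - p_{success})^k \right) + (1 - p_{success})^k d_0 \\
&\le \varepsilon_0 + (1 - p_{success})^k (\varepsilon_0 + d_0) \\
&\le \varepsilon_0 + (1 - p_{success})^k (\varepsilon_0 + 1).
\end{aligned}$$

In order to be $\varepsilon$-close to $\varepsilon_0$, we need

$$(1 - p_{success})^k (\varepsilon_0 + 1) \le \varepsilon$$

which implies

$$k \ge \frac{\log(\varepsilon_0 + 1) + \log \frac{1}{\varepsilon}}{\log \frac{1}{1 - p_{success}}}$$

and thus

$$k = O\left( \log \frac{1}{\varepsilon} \right).$$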
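The logarithmic number of rounds can be seen by iterating the recursion directly. This is a rough numerical sketch that takes the recursion with equality, uses $p_{success} = 1/2$ and assumes $d_0 \le 1$ as in the bound above; both function names are introduced here for illustration.

```python
import math

def rounds_needed(eps, eps0, p_success=0.5, d0=1.0):
    """Iterate d_{k+1} = p*eps0 + (1-p)*d_k until d_k <= eps0 + eps."""
    d, k = d0, 0
    while d > eps0 + eps:
        d = p_success * eps0 + (1 - p_success) * d
        k += 1
    return k

def rounds_bound(eps, eps0, p_success=0.5):
    """Smallest integer k with (1-p)^k (eps0 + 1) <= eps, from the closed form."""
    return math.ceil((math.log(eps0 + 1) + math.log(1 / eps))
                     / math.log(1 / (1 - p_success)))
```

With $N = 5$, i.e. $\varepsilon_0 = 1/(2^5 - 2^{-5})$, shrinking $\varepsilon$ by a factor of 1000 adds roughly ten rounds, which is the $O(\log \frac{1}{\varepsilon})$ behaviour.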

(e) Optimizing the Circuit Complexity The random subset chosen in step 1 requires exactly $N$ random bits and at most $N - 1$ CNOT gates in depth at most $N - 1$. This can be optimized using techniques employed by parallel prefix adders. Suppose we were to map

$$|x_1\rangle |x_2\rangle \dots |x_N\rangle \mapsto |x_1\rangle |x_1 \oplus x_2\rangle |x_1 \oplus x_2 \oplus x_3\rangle \dots |x_1 \oplus x_2 \oplus \dots \oplus x_N\rangle.$$

Using a parallel prefix computation circuit from classical computation [LF80], we can decrease the depth to $\lceil \log N \rceil$ using at most $4N$ CNOT gates. Figure 5.5 shows what the parallel prefix adder looks like for $N = 16$.
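To make the idea concrete, the following sketch builds an in-place prefix-XOR network out of CNOT layers. It is an assumption of this illustration that a Brent-Kung-style up-sweep and down-sweep is used; this is not necessarily the [LF80] circuit, and it reaches depth about $2 \log N$ with fewer than $2N$ gates rather than depth $\lceil \log N \rceil$.

```python
def prefix_xor_layers(n):
    """Layers of (control, target) CNOT pairs computing all prefix XORs in place."""
    levels = (n - 1).bit_length()          # ceil(log2(n)) for n >= 1
    layers = []
    for d in range(levels):                # up-sweep: combine blocks of size 2^d
        half, stride = 1 << d, 1 << (d + 1)
        layers.append([(i - half, i) for i in range(stride - 1, n, stride)])
    for d in range(levels - 2, -1, -1):    # down-sweep: fill in remaining prefixes
        half, stride = 1 << d, 1 << (d + 1)
        layers.append([(i - half, i) for i in range(stride + half - 1, n, stride)])
    return [layer for layer in layers if layer]

def apply_layers(bits, layers):
    """Apply the CNOT layers to a classical bit string."""
    bits = list(bits)
    for layer in layers:
        for control, target in layer:
            bits[target] ^= bits[control]
    return bits
```

Within each layer no qubit appears twice, so the gates of a layer can run in parallel; for $N = 16$ this gives 7 layers instead of 15 sequential CNOT gates.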

For our purposes, we only need to compute the parity of at most $N$ qubits and do not need the partial sums. This is similar to a parallel prefix circuit, except that we do not need the prefixes. Thus we generate the CNOT circuit first using the appropriately chosen subset. We will end up with a circuit of CNOT gates from $r$ qubits onto one qubit, which we will call the "last qubit". Then, we transform this circuit into a parallel prefix circuit, but we only consider gates that affect the last qubit and ignore the other gates. We are left with a circuit $C$ that is half of the circuit from Figure 5.5, highlighted using a dashed rectangle. Finally, we need to uncompute the intermediate results on all but the last qubit. This can be accomplished by applying the CNOT gates in $C$ that do not involve the original control again in their reverse order. This yields an equivalent circuit of depth $O(\log N)$ and $O(N)$ CNOT gates.

Figure 5.5: A Parallel Prefix Adder for 16 Qubits.
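The compute-then-uncompute idea can be sketched on classical bits, since CNOT circuits act on computational basis states as XOR networks. The code below is an illustration under stated assumptions, not the circuit from Figure 5.5: it builds a balanced binary tree of CNOTs onto the last qubit and then re-applies, in reverse order, the gates that do not target the last qubit. The function names are hypothetical.

```python
from itertools import product

def parity_tree_layers(n):
    """Layers of (control, target) CNOTs that XOR qubits 0..n-1 into qubit n-1."""
    active = list(range(n))
    layers = []
    while len(active) > 1:
        layer, survivors = [], []
        i = len(active) - 1
        # Pair from the right so that the last qubit is always a target.
        while i >= 1:
            layer.append((active[i - 1], active[i]))
            survivors.append(active[i])
            i -= 2
        if i == 0:
            survivors.append(active[0])  # unpaired qubit waits for the next layer
        active = sorted(survivors)
        layers.append(layer)
    return layers

def parity_with_uncompute(n):
    """Compute the parity on the last qubit, then restore all other qubits."""
    compute = parity_tree_layers(n)
    last = n - 1
    undo = [[g for g in layer if g[1] != last] for layer in reversed(compute)]
    return compute + [layer for layer in undo if layer]

def run(layers, bits):
    """Apply the CNOT layers to a classical bit string."""
    bits = list(bits)
    for layer in layers:
        for control, target in layer:
            bits[target] ^= bits[control]
    return bits

circuit = parity_with_uncompute(5)
print(len(parity_tree_layers(5)), run(circuit, (1, 0, 1, 1, 0)))
```

The compute part has $\lceil \log N \rceil$ layers and $N - 1$ gates, and the uncompute part adds at most as many again, which matches the $O(\log N)$ depth and $O(N)$ gate count stated above.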

The circuit for step 2 uses up to $2(N - 1)$ CNOT gates, $N$ single-qubit gates $\mathbb{1}$, $T$, or
$T^2$, and a single $H$ gate to uniformize $X$ and $Y$. It has depth $2N$ and uses 2 random bits
per CNOT for the first part of CNOT gates to get probability 1/4, and it uses a single
random bit per CNOT in the second part. The $X$-$Y$ uniformization costs a single random
bit, and the N T l twirls cost ^j|y random bits. This gives a total of O(N) gates in depth 
O(N) using O(NlogN) random bits. 



Figure 5.6: Hadamard Conjugation Flips Target and Control of a CNOT Gate.

We can optimize the depth for this as well. Once the circuit has been established, we can transform both CNOT parts. For the first part, we use that conjugating a CNOT with $H \otimes H$ swaps control and target, as seen in Figure 5.6. Applying a Hadamard to all qubits before and after the first CNOT part swaps the controls and targets of all CNOT gates, using $HH = \mathbb{1}$ between any two consecutive CNOT gates. This conjugation by Hadamard gates is illustrated in Figure 5.7.

Figure 5.7: Hadamard Conjugation Flips Targets and Controls in the Circuit.
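The identity behind Figure 5.6 can be verified directly with 4-by-4 matrices. The following sketch uses plain Python lists; the basis ordering $|00\rangle, |01\rangle, |10\rangle, |11\rangle$ with the first qubit as the left tensor factor is an assumption of this example, as are the helper names.

```python
from math import sqrt

def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def matmul(a, b):
    """Plain matrix product of nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

s = 1 / sqrt(2)
H = [[s, s], [s, -s]]

# CNOT with control on the first qubit and target on the second.
CNOT_01 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
# CNOT with control on the second qubit and target on the first.
CNOT_10 = [[1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0]]

HH = kron(H, H)
conjugated = matmul(matmul(HH, CNOT_01), HH)

# Largest entrywise deviation from the CNOT with swapped roles.
deviation = max(abs(conjugated[i][j] - CNOT_10[i][j]) for i in range(4) for j in range(4))
print(deviation < 1e-12)
```

Because $HH = \mathbb{1}$, the Hadamards between consecutive CNOT gates cancel, so only the outermost layers of Hadamard gates remain.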
Using the same construction as in the optimization of step 1, we can also reduce the depth to $O(\log N)$ and $O(N)$ CNOT gates. Counting the necessary $2N$ Hadamard gates, we end up with $O(N)$ gates. Note that the second part of CNOT gates (see the corresponding figure) can be directly transformed using this method, without conjugating by Hadamard gates, as they already have the correct orientation.

These optimizations thus reduce the depth of steps 1 and 2 to $O(\log N)$ and retain the circuit complexity of $O(N)$ $H$, $T$, $T^2$ and CNOT gates. We need to repeat both steps $O\!\left(\log\frac{1}{\epsilon}\right)$ times and the total complexity follows.



(5) Error Bound To bound the error, we consider $\Lambda_T$ that we would end up with had we perfectly symmetrized the Paulis. We assume the initial Pauli channel

$$\Lambda(\rho) = \sum_{k=1}^{D^2} a_k P_k \rho P_k^\dagger$$

and denote the probability distribution of our conjugation process on $P_j$ by $\beta_{j,k}$ for $k = 2, 3, \ldots, D^2$.
Recall that

$$\|\Lambda\|_\diamond = \sup_{\|\rho\|_1 = 1} \|(\Lambda \otimes \mathbb{1})(\rho)\|_1$$

and observe that our conjugation process will not change the all-identity tensor-product Pauli. Thus

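$$\begin{aligned}
\|\Lambda_\epsilon - \Lambda_T\|_\diamond
&= \sup_{\|\rho\|_1 = 1} \Big\| \sum_{j=2}^{D^2} a_j \sum_{k=2}^{D^2} \beta_{j,k}\, (P_k \otimes \mathbb{1}) \rho (P_k \otimes \mathbb{1})^\dagger - \sum_{j=2}^{D^2} a_j \sum_{k=2}^{D^2} u(k)\, (P_k \otimes \mathbb{1}) \rho (P_k \otimes \mathbb{1})^\dagger \Big\|_1 \\
&= \sup_{\|\rho\|_1 = 1} \Big\| \sum_{j=2}^{D^2} \sum_{k=2}^{D^2} a_j \big(\beta_{j,k} - u(k)\big) (P_k \otimes \mathbb{1}) \rho (P_k \otimes \mathbb{1})^\dagger \Big\|_1 \\
&\le \sup_{\|\rho\|_1 = 1} \sum_{j=2}^{D^2} \sum_{k=2}^{D^2} |a_j|\, |\beta_{j,k} - u(k)|\, \big\| (P_k \otimes \mathbb{1}) \rho (P_k \otimes \mathbb{1})^\dagger \big\|_1 \\
&= \sup_{\|\rho\|_1 = 1} \sum_{j=2}^{D^2} |a_j| \sum_{k=2}^{D^2} |\beta_{j,k} - u(k)|\, \|\rho\|_1 \\
&= \sum_{j=2}^{D^2} |a_j| \sum_{k=2}^{D^2} |\beta_{j,k} - u(k)|,
\end{aligned}$$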
where we used that $\|(P_k \otimes \mathbb{1}) \rho (P_k \otimes \mathbb{1})^\dagger\|_1 = \|\rho\|_1$, since $P_k \otimes \mathbb{1}$ is unitary and $\|\cdot\|_1$ is unitarily invariant. □

We need to take the error bound into account to further derive the unitary 2-design condition. The input for Theorem 5.2.14 is the Pauli-twirled superoperator from Lemma 5.2.4. Taking the completely positive superoperator before the Pauli twirl to be

$$\Lambda(\rho) = \sum_k A_k \rho A_k^\dagger$$


we see that = 4$ J2k |cüfcj| 2 , where ^2<^k,jPj = ^k such that 

Using the fact that {-4=i^ | j — 1, 2, . . . , D 2 } is an orthonormal basis, we conclude 



D 2 
J=2 



J=2 fe 



£3 (ZZ 



k j=l 



E 



4b, 



^(i>(^)-i£|tr4i| 2 ) 

\ fe kJ 



— (DtrA(l)-trA 



using the formulas for $\operatorname{tr}\Lambda(\mathbb{1})$ and $\operatorname{tr}\Lambda$ from Theorem 2.3.18 in conjunction with the calculations in the proof of Corollary 2.3.19.
This yields the bound

. ii DtrA(l) - trÀ . . 
||A e - Acllo < y (eo + e) 



We have that 



B(A) 



D A 

DtrA(l) - trÀ 
D 1 ' 
5.3 Discussion

We introduced the notion of a 2-design and showed that such an object was already used in the quantum information theory literature for bipartite state twirling [DLT02]. However, it did not seem to be known that quantum operations can be twirled using the same object.

We note that the private quantum channel result [AMTdW00] shows that the tensor-product Paulis $\mathcal{P}(D)$ satisfy



for all states $\rho$. It can be extended to all linear operators $\rho$ by linearity and the fact that the Hermitian operators span $L(\mathcal{H})$, using the same arguments as in Section 5.1. Hence we can see that this condition is equivalent to



for all homogeneous polynomials of degree (1,1), which we will call a unitary 1-design.
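As an illustration of the 1-design property, the following sketch checks the single-qubit case numerically. It assumes the standard form of the private quantum channel statement, namely that averaging $P \rho P^\dagger$ uniformly over the four Paulis yields $\operatorname{tr}(\rho)\,\mathbb{1}/2$. The displayed condition is not legible in this copy of the text, so that form is an assumption of the example, and the helper names and the sample state are illustrative.

```python
def matmul(a, b):
    """Plain matrix product of nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def dagger(a):
    """Conjugate transpose."""
    return [[a[j][i].conjugate() for j in range(len(a))] for i in range(len(a[0]))]

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

def pauli_average(rho):
    """Uniform average of P rho P^dagger over the single-qubit Paulis."""
    total = [[0j, 0j], [0j, 0j]]
    for P in (I2, X, Y, Z):
        term = matmul(matmul(P, rho), dagger(P))
        total = [[total[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return [[entry / 4 for entry in row] for row in total]

# An arbitrary single-qubit density matrix (assumed sample input).
rho = [[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]]
print(pauli_average(rho))
```

The off-diagonal entries cancel and the diagonal is averaged, which is exactly what integration of $U \rho U^\dagger$ over the Haar measure would produce.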

In the abstract formulation of Definition 5.1.3, the notion of a unitary design might turn out to be useful in a broader context where Haar-randomization can be reduced to randomization over a fairly small set of quantum gates that have efficient circuit decompositions. Such applications beyond twirling are yet to be found or identified.





Chapter 6 

Conclusion and Future Research 



6.1 Conclusion 

We explored ways to efficiently estimate the average fidelity of a quantum channel or an implementation of a quantum algorithm $U$. It turned out that we re-discovered the previously known result that a complete set of mutually-unbiased bases gives a 2-design for quantum states. This condition was shown to be equivalent to giving an unbiased estimate of the average fidelity. Our contribution was an explicit circuit construction using $O(N^2)$ gates in depth $O(N)$ and $O(N)$ random bits.

Then, the notion of a 2-design for quantum states was generalized to 2-designs for unitary operators. Although the term "2-design" did not seem to have appeared before in the literature, the concept was implicitly used at least as early as 1996. It was independently proven [DLT02] [Cha05] that the Clifford group is a unitary 2-design by showing its use for quantum operation twirling and state twirling. An approximately uniform sampling algorithm over the Clifford group was proposed as well. Our contribution is the unified view of these different approaches as unitary 2-designs. Also, we showed that a subset of the Clifford group suffices to be exponentially close to a 2-design for both applications.

We have also seen that we can define the notion of a unitary 1-design, which was already implicitly shown to exist [AMTdW00] in the context of a private quantum channel. Interestingly, the Pauli group was this unitary 1-design.
6.2 Directions for Future Research and Open Problems

6.2.1 Definition of Unitary t-Designs

For future research, there are several areas to proceed in. First and foremost, it seems natural to extend the notion of unitary designs to $t$-designs for arbitrary $t$. A possible application is noise estimation scenarios where the time evolution of the average fidelity is of interest, as suggested in [EAZ05]. Imagine we model the evolution of our quantum system from time $t_0$ to $t_1$ by the quantum operation $\mathcal{E}_1$, and from $t_1$ to $t_2$ by the quantum operation $\mathcal{E}_2$.

Figure 6.1: Twirling two Successive Quantum Operations 

Suppose we twirl both operations with the same unitary as shown in Figure 6.1, so that we end up with the operation $\mathcal{E}'$ that maps

$$\mathcal{E}'(\rho) = \int_{U(D)} U\, \mathcal{E}_2\big(U^\dagger\, \mathcal{E}_1(U \rho U^\dagger)\, U\big)\, U^\dagger \, dU .$$

The integral contains three occurrences each of $U$ and $U^\dagger$, which suggests that estimating this integral requires a unitary 3-design.

Estimating the fidelity decay with even finer temporal resolution seems to require higher-order unitary designs, so that the quest for unitary $t$-designs can be motivated from this experimental point of view.

6.2.2 Proof Idea for Unitary t-Designs

We will present a proof technique that might be useful in generalizing 2-designs to $t$-designs for $t > 2$. It is based on representation theory and similar in spirit to the decomposition lemma of a unitarily invariant superoperator (Lemma 2.3.13). This technique might be useful in determining subgroups of $U(d)$ that could serve as 2-designs other than those already discovered. We will state where these ideas need to be extended and why they do not work yet.

Let $\mathcal{H}$ denote a Hilbert space of dimension $d$, and let $G \le U(d)$ be a subgroup of the unitary group $U(d)$ such that an invariant measure exists on $G$.
Definition 6.2.1. We will use the representation $\hat{U} : U(d) \to L(L(\mathcal{H}))$ from Definition 2.3.12, which was defined as

$$\hat{U} \rho = U \rho U^\dagger$$

for all $U \in U(d)$. Note that this is also a representation of $G$.

Furthermore, we will call a linear operator $\rho \in L(\mathcal{H})$ $G$-invariant if $V \rho V^\dagger = \rho$ for all $V \in G$.

Recall the definition of $U(d)$-invariance (Definition 2.3.12), which becomes a special case of Definition 6.2.1 if $G = U(d)$. We extend Definition 2.3.16 to $G$-twirling in the obvious way.

Definition 6.2.2. Let $\Lambda$ be a superoperator. Define the $G$-twirled superoperator as

$$\Lambda_G(\rho) = \int_G \hat{V} \Lambda \hat{V}^\dagger \, dV \, \rho = \int_G V^\dagger \Lambda(V \rho V^\dagger) V \, dV .$$

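Lemma 6.2.3. The $G$-twirled superoperator $\Lambda_G$ is $G$-invariant.

Proof. The proof is similar to the proof of Lemma 2.3.17. Let $U \in G$ and $\rho \in L(\mathcal{H})$.

$$\begin{aligned}
\hat{U} \Lambda_G \hat{U}^\dagger \rho &= U^\dagger \int_G V^\dagger \Lambda(V U \rho U^\dagger V^\dagger) V \, dV \, U \\
&= \int_G (VU)^\dagger \Lambda\big((VU) \rho (VU)^\dagger\big) (VU) \, dV \\
&= \int_G (V')^\dagger \Lambda\big(V' \rho (V')^\dagger\big) V' \, dV' \\
&= \Lambda_G(\rho),
\end{aligned}$$

where we used the $G$-invariance of the measure $dV$ on $G$ and the substitution $V' = VU$. □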

The following conjecture is the critical point of this technique and remains to be shown.

Conjecture 6.2.4. If the irreducible representations of $\hat{U}$ of $U(d)$ are also irreducible for $G$, then $\Lambda_G$ is $U(d)$-invariant.

Proof idea. As the irreducible representations for $U(d)$ are also irreducible for $G$, Schur's Lemma (Fact A.8.5) implies that $\Lambda_G$ will act as a multiple of the identity on the same subspaces as $\Lambda_T$.

In the case of $t = 2$, these irreducible subspaces are known from Lemma 2.3.13 as the traceless linear operators and multiples of the identity, possibly with different coefficients than $\Lambda_T$ would have. For this decomposition, we easily see that $\Lambda_G$ is unitarily invariant by calculating

$$\begin{aligned}
\hat{U} \Lambda_G \hat{U}^\dagger \rho &= U^\dagger\, p \Big[ U \rho U^\dagger - \operatorname{tr}(U \rho U^\dagger) \frac{\mathbb{1}}{d} \Big] U + U^\dagger\, q \operatorname{tr}(U \rho U^\dagger) \frac{\mathbb{1}}{d}\, U \\
&= p \Big( \rho - \operatorname{tr}\rho \, \frac{\mathbb{1}}{d} \Big) + q \operatorname{tr}\rho \, \frac{\mathbb{1}}{d} = \Lambda_G(\rho).
\end{aligned}$$

Thus $\Lambda_G$ is $U(d)$-invariant.
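The calculation above can be reproduced numerically for $d = 2$. The sketch below is an illustration only: the coefficients $p$ and $q$, the sample operator, and the parametrization of the unitary are assumptions chosen for the example, and the function names are hypothetical.

```python
import cmath
from math import cos, sin

def matmul(a, b):
    """Plain matrix product of nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def dagger(a):
    """Conjugate transpose."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))] for i in range(len(a[0]))]

def superop(rho, p, q):
    """Lambda(rho) = p (rho - tr(rho) 1/d) + q tr(rho) 1/d."""
    d = len(rho)
    t = sum(rho[i][i] for i in range(d))
    return [[p * (rho[i][j] - (t / d if i == j else 0)) + (q * t / d if i == j else 0)
             for j in range(d)] for i in range(d)]

def conjugated_superop(U, rho, p, q):
    """U^dagger Lambda(U rho U^dagger) U."""
    inner = matmul(matmul(U, rho), dagger(U))
    return matmul(matmul(dagger(U), superop(inner, p, q)), U)

# A generic single-qubit unitary (angles are arbitrary sample values).
theta, phi, lam = 0.7, 1.1, -0.4
U = [[cos(theta), -cmath.exp(1j * lam) * sin(theta)],
     [cmath.exp(1j * phi) * sin(theta), cmath.exp(1j * (phi + lam)) * cos(theta)]]

rho = [[0.6, 0.3 + 0.2j], [0.1 - 0.4j, 0.4]]  # any linear operator works
lhs = conjugated_superop(U, rho, 0.35, 0.9)
rhs = superop(rho, 0.35, 0.9)
print(max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2)))
```

The check works because both the traceless part and the multiples of the identity are preserved as subspaces under conjugation by any unitary.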

However, to extend this to general $t$, we need to be able to show that $\Lambda_G$ is unitarily invariant either without knowing the explicit decomposition into the irreducible representations or by making use of this explicit decomposition.

If this conjecture were proven, we could show the following corollary.

Corollary 6.2.5. For any superoperator $\Lambda$,

$$\int_G \hat{V} \Lambda \hat{V}^\dagger \, dV = \int_{U(d)} \hat{U} \Lambda \hat{U}^\dagger \, dU .$$

Proof. The invariant measure $dV$ on $G$ can be trivially extended to a probability measure $\mu$ on $U(d)$ by letting $\int_E d\mu = \int_{E \cap G} dV$. Lemma 5.2.6 implies that $\Lambda_\mu = \Lambda_G = \Lambda_T$ and the statement follows. □

The corollary that $G$ is a unitary $t$-design could be proven the following way: First, we could use Conjecture 6.2.4, and thus we need to show that the irreducible subspaces of $\hat{U}$ for $U(d)$ remain irreducible for $G$.

In order to prove that the Clifford group is a 2-design, we could use 2.3.15 and show that the space of traceless Hermitian operators is irreducible under the Clifford group. However, we were not able to show that. Maybe the Clifford group would turn out to be a $t$-design for $t > 2$. This should be the subject of future research.

It also seems worthwhile to look into the random circuit construction (see Section 2.3.7) again and figure out how unitary $t$-designs might be derived using this approach.



6.2.3 Find a Better Approximate Pauli Uniformization

So far, the Pauli uniformization has an absolute lower bound of $\epsilon_0 \approx 1/2^N$ from Theorem 5.2.14. Maybe one could improve the analysis to get an arbitrarily small upper bound on $\epsilon_0$, or one might choose a slightly larger subset of the Clifford group which facilitates that.
6.2.4 Extend the Approximate Pauli Uniformization

The construction in Theorem 5.2.14 only works for qubits. In order to make it work for qudits as well, we need to find an analog of the generator of the single-Pauli twirl $T = HP$. Assuming $d \ge 3$ is a prime, we could use the special phase gate

j-,d\ \ rx 2 /2, v 

for r e FJ we can conjugate 

p d , r x a z\p?) ] = x a z h - aT ~ x 

and hence we can uniformize the $Z$ component provided $a \neq 0$.

In order to uniformize the $X$ component, we could make use of the Quantum Fourier Transform modulo $d$, which is given by

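$$F_d |x\rangle = \frac{1}{\sqrt{d}} \sum_{y \in \mathbb{F}_d} \omega^{xy} |y\rangle .$$

It acts on the Paulis by conjugation as

$$F_d X^a Z^b F_d^\dagger = X^{-b} Z^a,$$

which implies

$$F_d^3 X^a Z^b (F_d^3)^\dagger = X^b Z^{-a}.$$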

This allows us to uniformize the $X$ component by conjugating with $F_d^3 P_{d,r} F_d$ as long as the $X$ component is non-zero.

The problem is the case where either the $X$ or the $Z$ component is zero, as those will not be reached by one of these two randomization steps. However, it seems conceivable that an alternating chain of conjugation by $F_d^3 P_{d,r} F_d$ and $P_{d,r}$ could create an almost uniform distribution over all non-identity Paulis. Maybe one could even find a generator of a cyclic shift that is analogous to $T$ in the qubit case.

The next step is to find an analog of the CNOT operation. Observe that the obvious generalization of the CNOT is $\mathrm{CPLUS}_d$, which is defined by $\mathrm{CPLUS}_d |x\rangle |y\rangle = |x\rangle |y + 1\rangle$ where addition is modulo $d$. It conjugates

$$\mathrm{CPLUS}_d \big(X^{a_1} Z^{b_1} \otimes X^{a_2} Z^{b_2}\big) \mathrm{CPLUS}_d^\dagger = X^{a_1} Z^{b_1 + b_2} \otimes X^{a_2 - a_1} Z^{b_2}$$

and thus seems a reasonable candidate for further investigation.
If that turned out not to be a good choice, one could try the generalization
$$\mathrm{CSUM}_d^{(\alpha, \beta)} |x\rangle |y\rangle = |\alpha x + \beta y\rangle |\beta x - \alpha y\rangle,$$
which is reversible if $\alpha, \beta \in \mathbb{F}_d$ and $(\alpha, \beta) \neq (0, 0)$. It can be shown that

CSUM a >P) d (X ai Z bl X a2 Z b2 )(CSUM^)y = X^T^ Z abl+l3h2 X^T^Z^ 1 "^ 2 , 
which might be more suitable than CPLUS d . 



Appendix A 

Mathematical Background 



This appendix is intended to be a reference for the mathematical concepts and notations used throughout this thesis. See [Bal98] for an introduction to the basic concepts of linear algebra in both finite and infinite-dimensional settings that is streamlined to the description of quantum mechanics. For the finite-dimensional case of quantum computing and quantum information theory topics, [NC00] is the most suitable reference to date.

A.1 Vector Spaces

Definition A.1.1. A vector space over a field $F$ is a set $V$ together with two binary operations

• vector addition: $V \times V \to V$, written $u + v$ with $u, v \in V$ and

• scalar multiplication: $F \times V \to V$, denoted by $au$ with $a \in F$, $u \in V$

such that the following axioms hold:

1. Associativity of vector addition: $u + (v + w) = (u + v) + w$ for all $u, v, w \in V$

2. Commutativity of vector addition: $u + v = v + u$ for all $u, v \in V$

3. Existence of an additive identity $0 \in V$ such that $u + 0 = u$ for all $u \in V$

4. Existence of an inverse vector $-u$ for all $u \in V$ such that $u + (-u) = 0$

5. Associativity of scalar multiplication: $a(bu) = (ab)u$ for all $a, b \in F$ and $u \in V$

6. $1u = u$ for all $u \in V$

7. Distributivity of scalar multiplication over vector addition: $a(u + v) = au + av$

8. Distributivity of scalar multiplication over scalar addition: $(a + b)u = au + bu$

The elements $u \in V$ are called vectors and the elements of $F$ are called scalars.

Definition A.1.2. A real vector space is a vector space over the real numbers. A complex 
vector space is a vector space over the complex numbers. 

Definition A.1.3. A subspace $W$ of a vector space $V$ is a subset that is closed under vector addition and scalar multiplication. The intersection of all subspaces that contain a given set of vectors $S$ is called the span of $S$. A set of vectors $S = \{v_1, \ldots, v_n\} \subseteq V$ is called linearly independent if

$$a_1 v_1 + \cdots + a_n v_n = 0$$

has only the trivial solution $a_1 = \cdots = a_n = 0$. A linearly independent set $S$ is called a basis if the span of $S$ is $V$.

Every basis for a vector space $V$ has the same cardinality, which is called the dimension of $V$. All vector spaces over a given field $F$ of the same dimension are isomorphic.
Sometimes it is helpful to write a vector space as a sum of some of its subspaces. 

Definition A.1.4. Let $V$ and $W$ be vector spaces over a field $K$. The direct sum of $V$ and $W$ is the Cartesian product $V \times W$ endowed with the vector space operations

1. $(v_1, w_1) + (v_2, w_2) = (v_1 + v_2, w_1 + w_2)$ for all $v_1, v_2 \in V$, $w_1, w_2 \in W$ and

2. $a(v, w) = (av, aw)$ for all $a \in K$, $v \in V$, $w \in W$.

The resulting vector space is written as $V \oplus W$.

Definition A.1.5. Let $V$ be a vector space over a subfield $F \subseteq \mathbb{C}$ of the complex numbers. A norm on $V$ is a function $|\cdot| : V \to \mathbb{R}$ such that the following properties hold:

1. Positivity: $|v| \ge 0$ for all $v \in V$

2. Positive scalability: $|av| = |a||v|$ for all $a \in F$, $v \in V$

3. Triangle inequality: $|u + v| \le |u| + |v|$

4. Positive definiteness: $|v| = 0$ iff $v = 0$

Definition A.1.6. A normed vector space is a pair $(V, |\cdot|)$ such that $V$ is a vector space and $|\cdot|$ is a norm on $V$. A vector $v \in V$ is normalized if $|v| = 1$.
Definition A.1.7. A function from a vector space $V$ to a vector space $W$ over the same field $F$, $f : V \to W$, is called linear if

• $f(u + v) = f(u) + f(v)$ for all $u, v \in V$ and

• $f(au) = a f(u)$ for all $a \in F$, $u \in V$.

A function $f : V \times W \to X$ for vector spaces $V$, $W$, $X$ over the same field $F$ is bilinear if

1. $v \mapsto f(v, w)$ is linear for every $w \in W$ and

2. $w \mapsto f(v, w)$ is linear for every $v \in V$.

A function $f : V \times W \to X$ for vector spaces $V$, $W$, $X$ over a subfield $F \subseteq \mathbb{C}$ of the complex numbers is sesquilinear if it is bilinear except $f(av, w) = \bar{a} f(v, w)$.

Definition A.1.8. Let $V$, $W$ be normed vector spaces with norms $|\cdot|_V$, $|\cdot|_W$. These norms induce a norm on the set of linear operators $L(V, W)$ from $V$ to $W$ defined as

$$\|A\| = \sup_{|v|_V = 1} |Av|_W .$$

We will call $\|\cdot\| : L(V, W) \to \mathbb{R}$ the induced operator norm on $L(V, W)$.

Definition A.1.9. A complex inner product space is a vector space $V$ over $\mathbb{C}$ together with a map $(\cdot, \cdot) : V \times V \to \mathbb{C}$ such that

1. $(\cdot, \cdot)$ is sesquilinear,

2. $(u, v) = \overline{(v, u)}$ for all $u, v \in V$,

3. $(v, v) \ge 0$ for all $v \in V$, and

4. $(v, v) = 0$ iff $v = 0$ for all $v \in V$.

Definition A.1.10. Let $V$ be a complex inner product space. Two vectors $u, v \in V$ are orthogonal if $(u, v) = 0$.

Definition A.1.11. For every complex inner product space $V$, there is a norm $|v| = \sqrt{(v, v)}$. $V$ is complete with respect to that norm if every Cauchy sequence converges to an element of that space. A complete normed complex inner product space $\mathcal{H}$ is called a Hilbert space. Note that in the mathematical literature, a distinction is made between real and complex Hilbert spaces, which are complete normed inner product spaces over the real and complex numbers, respectively. However, quantum computing literature understands a Hilbert space as defined above. We will use that definition throughout this thesis.
Definition A.1.12. An orthonormal basis for a Hilbert space $\mathcal{H}$ is a set $S \subseteq \mathcal{H}$ whose span is dense in $\mathcal{H}$ and whose elements are pairwise orthogonal and have norm one.

Fact A.1.13.

1. Every finite-dimensional complex inner product space $\mathcal{H}$ is a Hilbert space.

2. Every Hilbert space $\mathcal{H}$ has an orthonormal basis. Any two orthonormal bases of $\mathcal{H}$ have the same cardinality.

Definition A.1.14. Let $f$ be a sesquilinear function $f : V \times V \to \mathbb{C}$ for a vector space $V$ over a subfield $F \subseteq \mathbb{C}$ of the complex numbers. Given $u$, the map $v \mapsto f(u, v)$ is called a linear functional on $V$. The set of all linear functionals on $V$ forms a vector space under addition of functions and scalar multiplication. It is called the dual space of $V$ and denoted $V^*$.

A.2 Dirac Notation 

Paul Dirac introduced a convenient notation for Hilbert spaces that has been widely accepted in quantum mechanics literature. This notation is sometimes referred to as "bra-ket" notation because the inner product of two vectors is denoted by a bracket $\langle\phi|\psi\rangle$ or $\langle\phi, \psi\rangle$. The left part, $\langle\phi|$, is called "bra", and the right part, $|\psi\rangle$, is called "ket". Let $\mathcal{H}$ be an $n$-dimensional Hilbert space. Most of the definitions hold for infinite-dimensional Hilbert spaces as well, but those are not necessary for quantum computing.

Definition A.2.1. Each vector in $\mathcal{H}$ is called "ket" and written as $|\psi\rangle$, where $\psi$ denotes the vector and the bar and angle bracket denote that it is to be understood as the vector $\psi$, read "ket psi".

Fact A.2.2. For every $|\psi\rangle \in \mathcal{H}$, there is exactly one dual $\langle\psi| \in \mathcal{H}^*$, read "bra psi", which is a continuous linear functional from $\mathcal{H}$ to $\mathbb{C}$:

$$\langle\psi|(|\phi\rangle) = (|\psi\rangle, |\phi\rangle) \quad \text{for all } |\phi\rangle \in \mathcal{H}.$$

The converse is true as well, as $\mathcal{H}$ and $\mathcal{H}^*$ are isometrically isomorphic.

Definition A.2.3. A linear operator on $\mathcal{H}$ is a linear function from $\mathcal{H}$ to $\mathcal{H}$. Operators act on kets from the left. Let $A$ be a linear operator on a Hilbert space $\mathcal{H}$, then $A|\psi\rangle = A(|\psi\rangle)$. Operators can also act on bras from the right hand side, such that $\langle\phi|A$ is understood as the functional that acts as $(\langle\phi|A)(|\psi\rangle) = \langle\phi|(A|\psi\rangle) = \langle\phi|A|\psi\rangle$.
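Fact A.2.4. Let $A$ be a linear operator on $\mathcal{H}$. There is a unique linear operator $A^\dagger$ such that

$$(|\phi\rangle, A|\psi\rangle) = (A^\dagger|\phi\rangle, |\psi\rangle)$$

for all $|\phi\rangle, |\psi\rangle \in \mathcal{H}$. If we define $\langle\phi|$ as the linear operator that maps $\langle\phi|(|\psi\rangle) = \langle\phi|\psi\rangle$, we have that $|\phi\rangle^\dagger = \langle\phi|$. It follows that $(A|\phi\rangle)^\dagger = \langle\phi|A^\dagger$. We also note that $(AB)^\dagger = B^\dagger A^\dagger$ for $A, B \in L(\mathcal{H})$.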

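Definition A.2.5. Let $A$ be a linear operator on $\mathcal{H}$. $A$ is

• invertible if there is an operator $A^{-1}$ such that $A \circ A^{-1} = A^{-1} \circ A = \mathbb{1}$ is the identity operator on $\mathcal{H}$.

• Hermitian or self-adjoint if $A = A^\dagger$.

• normal if $AA^\dagger = A^\dagger A$.

• unitary if $AA^\dagger = \mathbb{1}$.

• positive if $\langle\psi|A|\psi\rangle \ge 0$ for all $|\psi\rangle \in \mathcal{H}$.

Fact A.2.6. A unitary operator $U$ on $\mathcal{H}$ preserves inner products:

$$(U|\psi\rangle, U|\phi\rangle) = \langle\psi|U^\dagger U|\phi\rangle = \langle\psi|\phi\rangle .$$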

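Fact A.2.7. Let $B = \{|\psi_1\rangle, \ldots, |\psi_n\rangle\}$ be an orthonormal basis for $\mathcal{H}$ and let $A$ be a linear operator on $\mathcal{H}$. If we choose to represent vectors in $\mathcal{H}$ as column vectors with $n$ entries in $\mathbb{C}$, we can represent $A$ by the $n \times n$ matrix $(a_{i,j})$ with elements $a_{i,j} = \langle\psi_i|A|\psi_j\rangle$.

Definition A.2.8. Let $A \in L(\mathcal{H})$, $|\psi\rangle \in \mathcal{H}$ nonzero, and $\lambda \in \mathbb{C}$. $|\psi\rangle$ is called an eigenvector of $A$ with eigenvalue $\lambda$ if

$$A|\psi\rangle = \lambda|\psi\rangle .$$

Fact A.2.9 (Spectral Decomposition Theorem). Let $A$ be a normal linear operator on $\mathcal{H}$. Then there is an orthonormal basis $\{|\psi_1\rangle, \ldots, |\psi_n\rangle\}$ of $\mathcal{H}$ and $\lambda_1, \ldots, \lambda_n \in \mathbb{C}$ such that

$$A = \sum_{i=1}^{n} \lambda_i |\psi_i\rangle\langle\psi_i| .$$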
Definition A.2.10. We define the "outer product" of two vectors $|\phi\rangle$ and $|\psi\rangle$ as the operator

$$(|\phi\rangle\langle\psi|)(|\chi\rangle) = |\phi\rangle\langle\psi|\chi\rangle = \langle\psi|\chi\rangle |\phi\rangle .$$

This outer product notation is generally used to define projection operators. Given a normalized $|\phi\rangle \in \mathcal{H}$, we define the operator that projects onto the subspace spanned by $|\phi\rangle$ as $|\phi\rangle\langle\phi|$.

Fact A.2.11. The trace is the unique linear function

$$\operatorname{tr} : L(\mathcal{H}) \to \mathbb{C}$$

such that

• it is unitarily invariant, $\operatorname{tr} A = \operatorname{tr} U A U^\dagger$ for all linear operators $A$ and unitary operators $U$, and

• $\operatorname{tr} \mathbb{1} = n$.

Let $A$ be a linear operator on $\mathcal{H}$ and let $(a_{i,j})$ be a matrix representation of $A$ in some orthonormal basis. Then

$$\operatorname{tr} A = \sum_{i=1}^{n} a_{i,i} .$$

The trace function is well defined and does not depend on the specific representation of $A$. Furthermore, the following algebraic properties hold. Let $A, B \in L(\mathcal{H})$, then

• $\operatorname{tr}(AB) = \operatorname{tr}(BA)$ (cyclic property)

• tr A = tr A^ 

Fact A.2.12. The set of all linear operators on $\mathcal{H}$ forms an $n^2$-dimensional vector space and is denoted by $L(\mathcal{H})$. $L(\mathcal{H})$ is a Hilbert space with inner product $(A, B) = \operatorname{tr} A^\dagger B$. This inner product is called Hilbert-Schmidt or trace inner product.

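Definition A.2.13. Let $V$ and $W$ be Hilbert spaces of dimensions $m$ and $n$, respectively. Then the tensor product of $V$ and $W$, written as $V \otimes W$, is an $mn$-dimensional complex vector space. It is spanned by the elements $|\psi\rangle \otimes |\phi\rangle$ with $|\psi\rangle \in V$ and $|\phi\rangle \in W$, and the vector addition and scalar multiplication satisfy the following restrictions:

1. $a(|\psi\rangle \otimes |\phi\rangle) = (a|\psi\rangle) \otimes |\phi\rangle = |\psi\rangle \otimes (a|\phi\rangle)$ for all $|\psi\rangle \in V$, $|\phi\rangle \in W$, $a \in \mathbb{C}$

2. $(|\psi\rangle + |\phi\rangle) \otimes |\chi\rangle = |\psi\rangle \otimes |\chi\rangle + |\phi\rangle \otimes |\chi\rangle$ for all $|\psi\rangle, |\phi\rangle \in V$, $|\chi\rangle \in W$

3. $|\chi\rangle \otimes (|\psi\rangle + |\phi\rangle) = |\chi\rangle \otimes |\psi\rangle + |\chi\rangle \otimes |\phi\rangle$ for all $|\chi\rangle \in V$, $|\psi\rangle, |\phi\rangle \in W$

The tensor product of two vectors $|\psi\rangle \otimes |\phi\rangle$ is most often abbreviated as $|\psi, \phi\rangle$, or even $|\psi\phi\rangle$.

Given linear operators $A$ on $V$ and $B$ on $W$, we can define the operator $A \otimes B$ by letting

$$(A \otimes B)(|\psi\rangle \otimes |\phi\rangle) = A|\psi\rangle \otimes B|\phi\rangle .$$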


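Fact A.2.14. Let $V$ and $W$ be Hilbert spaces of dimensions $m$ and $n$. Then $V \otimes W$ is a Hilbert space of dimension $mn$ with inner product

$$(|\psi_1\rangle \otimes |\phi_1\rangle, |\psi_2\rangle \otimes |\phi_2\rangle) = \langle\psi_1|\psi_2\rangle \langle\phi_1|\phi_2\rangle .$$
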

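To make the discussion about tensor products a little more concrete, we will have a look at an example tensor product. Pick an orthonormal basis for Hilbert spaces $V$ and $W$ of dimensions $n$ and $m$, respectively. Then we can represent their elements as column vectors. Let $|\psi\rangle \in V$, $|\phi\rangle \in W$:

$$|\psi\rangle = \begin{pmatrix} \psi_1 \\ \psi_2 \\ \vdots \\ \psi_n \end{pmatrix}, \qquad |\phi\rangle = \begin{pmatrix} \phi_1 \\ \phi_2 \\ \vdots \\ \phi_m \end{pmatrix}.$$

The tensor product of $|\psi\rangle$ and $|\phi\rangle$ is given by the Kronecker product if we think of these vectors as $n$-by-1 and $m$-by-1 matrices. Hence

$$|\psi\rangle \otimes |\phi\rangle = \begin{pmatrix} \psi_1 \phi_1 \\ \psi_1 \phi_2 \\ \vdots \\ \psi_1 \phi_m \\ \psi_2 \phi_1 \\ \vdots \\ \psi_n \phi_m \end{pmatrix}.$$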

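Linear operators on $V$ are represented by $n$-by-$n$ complex matrices in the usual way. Given two operators $A$ on $V$ and $B$ on $W$,

$$A = \begin{pmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n,1} & a_{n,2} & \cdots & a_{n,n} \end{pmatrix}, \qquad B = \begin{pmatrix} b_{1,1} & b_{1,2} & \cdots & b_{1,m} \\ \vdots & \vdots & \ddots & \vdots \\ b_{m,1} & b_{m,2} & \cdots & b_{m,m} \end{pmatrix}.$$

The operator $A \otimes B$ that acts on $V \otimes W$ is now given by the Kronecker product of $A$ and $B$:

$$A \otimes B = \begin{pmatrix} a_{1,1} B & a_{1,2} B & \cdots & a_{1,n} B \\ a_{2,1} B & a_{2,2} B & \cdots & a_{2,n} B \\ \vdots & \vdots & \ddots & \vdots \\ a_{n,1} B & a_{n,2} B & \cdots & a_{n,n} B \end{pmatrix},$$

where $a_{i,j} B$ means that the submatrix $B$ with all entries multiplied by $a_{i,j}$ is to be inserted.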


A.3 The Bloch Sphere 

We will make use of a nice geometrical interpretation of single qubit states. It is known that all the observable properties of a single qubit system can be described using the unit sphere. The state of a single qubit system is described by a unit vector $|\psi\rangle$ in a Hilbert space $\mathcal{H}_2$ of dimension 2 with orthonormal basis $\{|0\rangle, |1\rangle\}$,

$$|\psi\rangle = \alpha|0\rangle + \beta|1\rangle .$$

We will call this basis the computational basis. As $|\alpha|^2 + |\beta|^2 = 1$, we can rewrite

$$|\psi\rangle = e^{i\gamma} \left( \cos\frac{\theta}{2} |0\rangle + e^{i\varphi} \sin\frac{\theta}{2} |1\rangle \right).$$

The global phase factor has no observable properties, and hence this state is equivalent to

$$|\psi\rangle = \cos\frac{\theta}{2} |0\rangle + e^{i\varphi} \sin\frac{\theta}{2} |1\rangle$$

with two real parameters $\theta$ and $\varphi$. Now define

$$x = \sin\theta \cos\varphi, \qquad y = \sin\theta \sin\varphi, \qquad z = \cos\theta$$
and we have a mapping from the set of pure quantum states to the unit sphere. Figure A.1 shows the Bloch sphere with the computational basis states $|0\rangle$ and $|1\rangle$.

Figure A.1: Bloch sphere representation of the computational basis states.
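The mapping to the unit sphere can be computed either from the angles or directly from the amplitudes. The following sketch does both; it relies on the standard fact that the Bloch coordinates equal the expectation values of the three Pauli operators, and the function names are illustrative.

```python
import cmath
from math import cos, sin, sqrt

def state_from_angles(theta, phi):
    """Amplitudes (alpha, beta) of cos(theta/2)|0> + e^{i phi} sin(theta/2)|1>."""
    return cos(theta / 2), cmath.exp(1j * phi) * sin(theta / 2)

def bloch_from_angles(theta, phi):
    """Bloch coordinates (x, y, z) from the two angles."""
    return sin(theta) * cos(phi), sin(theta) * sin(phi), cos(theta)

def bloch_from_state(alpha, beta):
    """Bloch coordinates from normalized amplitudes; a global phase drops out."""
    alpha, beta = complex(alpha), complex(beta)
    overlap = alpha.conjugate() * beta
    return 2 * overlap.real, 2 * overlap.imag, abs(alpha) ** 2 - abs(beta) ** 2

s = 1 / sqrt(2)
print(bloch_from_state(1, 0))        # |0>, the north pole
print(bloch_from_state(s, s))        # (|0> + |1>)/sqrt(2), on the x axis
print(bloch_from_state(s, 1j * s))   # (|0> + i|1>)/sqrt(2), on the y axis
```

Multiplying both amplitudes by the same phase $e^{i\gamma}$ leaves all three coordinates unchanged, which reflects that the global phase is not observable.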
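The other two axes of the Bloch sphere correspond to the eigenbases of $X$,

$$\left\{ \frac{|0\rangle + |1\rangle}{\sqrt{2}}, \frac{|0\rangle - |1\rangle}{\sqrt{2}} \right\},$$

and of $Y$,

$$\left\{ \frac{|0\rangle + i|1\rangle}{\sqrt{2}}, \frac{|0\rangle - i|1\rangle}{\sqrt{2}} \right\}.$$

We will abbreviate the basis states and will use $\{|{+}\rangle, |{-}\rangle\}$ and $\{|{+i}\rangle, |{-i}\rangle\}$. Figure A.2 shows how these bases correspond to the three main axes of the Bloch sphere.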

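This representation can be used to describe single qubit unitary evolutions in a nice geometrical way. We will first introduce the Pauli matrices as they have a natural representation as rotations on the Bloch sphere.

Definition A.3.1. The Pauli operators are represented by the following matrices in the computational basis:

$$\sigma_x = |{+}\rangle\langle{+}| - |{-}\rangle\langle{-}| = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
\sigma_y = |{+i}\rangle\langle{+i}| - |{-i}\rangle\langle{-i}| = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad
\sigma_z = |0\rangle\langle 0| - |1\rangle\langle 1| = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

Figure A.2: Bloch sphere representation of the $\{|0\rangle, |1\rangle\}$, $\{|{+}\rangle, |{-}\rangle\}$ and $\{|{+i}\rangle, |{-i}\rangle\}$ bases.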

Fact A.3.2. Any unitary operator $U$ on $\mathcal{H}_2$ can be decomposed as

$$U = e^{i\alpha}\, e^{-i \frac{\theta}{2} (n_x \sigma_x + n_y \sigma_y + n_z \sigma_z)}$$

for real parameters $\alpha$, $\theta$ and a real unit vector $n = (n_x, n_y, n_z)$. Acting on the Bloch sphere, $U$ is a rotation by $\theta$ about the $n$ axis plus a global phase of $e^{i\alpha}$ that is not an observable property.

We see that $\sigma_j$ corresponds to a rotation of $\pi$ about the $j$ axis.
A.4 Density Operators

Definition A.4.1. A density operator is a positive operator $\rho \in L(\mathcal{H})$ with $\operatorname{tr}\rho = 1$.

Density operators are used to describe ensembles of quantum states. If we are given a state and the promise that it is $|\psi_i\rangle$ with probability $p_i$, $i = 1, \ldots, k$, we can incorporate our lack of knowledge about the state into a concise representation. This representation combines both the quantum mechanical concept of superpositions and the probability distribution over the set of states $\{|\psi_i\rangle \mid i = 1, \ldots, k\}$.

Fact A.4.2. Every ensemble $\{p_i, |\psi_i\rangle\}_{i=1}^{k}$ has the associated density operator

$$\rho = \sum_{i=1}^{k} p_i |\psi_i\rangle\langle\psi_i| .$$

Every density operator has an associated ensemble $\{p_i, |\psi_i\rangle\}_{i=1}^{k}$.

Definition A.4.3. A pure state is a single quantum state that is known exactly. The density operator of a pure state is of the form $\rho = |\psi\rangle\langle\psi|$. A mixed state is a state that is not pure.

Fact A.4.4. A state $\rho$ is pure if and only if $\operatorname{tr}(\rho^2) = 1$.
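The purity criterion is easy to evaluate for small examples. The following sketch builds density matrices from ensembles as in Fact A.4.2 and computes $\operatorname{tr}(\rho^2)$; the function names and the sample ensembles are assumptions made for illustration.

```python
from math import sqrt

def outer(psi):
    """Projector |psi><psi| for a state given as a list of amplitudes."""
    return [[complex(a) * complex(b).conjugate() for b in psi] for a in psi]

def density_from_ensemble(ensemble):
    """rho = sum_i p_i |psi_i><psi_i| for a list of (p_i, psi_i) pairs."""
    dim = len(ensemble[0][1])
    rho = [[0j] * dim for _ in range(dim)]
    for p, psi in ensemble:
        proj = outer(psi)
        rho = [[rho[i][j] + p * proj[i][j] for j in range(dim)] for i in range(dim)]
    return rho

def purity(rho):
    """tr(rho^2), which equals 1 exactly for pure states."""
    dim = len(rho)
    return sum(rho[i][k] * rho[k][i] for i in range(dim) for k in range(dim)).real

s = 1 / sqrt(2)
pure = density_from_ensemble([(1.0, [s, s])])
mixed = density_from_ensemble([(0.75, [1, 0]), (0.25, [0, 1])])
maximally_mixed = density_from_ensemble([(0.5, [1, 0]), (0.5, [0, 1])])
print(purity(pure), purity(mixed), purity(maximally_mixed))
```

For a qubit, the purity ranges from $1/2$ for the maximally mixed state up to $1$ for pure states.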

We will state a useful fact that we will make use of later on.

Fact A.4.5. The Pauli operators together with the identity form an orthogonal basis $\{\mathbb{1}, \sigma_x, \sigma_y, \sigma_z\}$ for the space of linear operators $L(\mathcal{H}_2)$ on a 2-dimensional Hilbert space, which becomes orthonormal with respect to the trace inner product after rescaling each element by $1/\sqrt{2}$.

We can now reformulate the postulates of quantum mechanics in terms of density operators. We will mostly make use of this alternate notation in the remainder of this thesis. The formulation of the postulates has been taken from [NC00].



Postulate 1 To any isolated physical system is associated a Hilbert space, the state space of the system. The system is completely described by its density operator acting on the state space. If a quantum system is in state $\rho_i$ with probability $p_i$, the density operator for the system is $\sum_i p_i \rho_i$.

Postulate 2 The evolution of a closed quantum system from time $t_1$ to time $t_2$ is described by a unitary transformation $U$ that only depends on the times $t_1$ and $t_2$:

$$\rho(t_2) = U \rho(t_1) U^\dagger$$
Postulate 3 A quantum measurement is described by a set of measurement operators $\{M_m\}$, where $M_m$ is a measurement operator acting on the state space of the system. The index $m$ denotes the measurement outcome. The measurement operators satisfy the completeness relation

$$\sum_m M_m^\dagger M_m = \mathbb{1} .$$

The probability of observing $m$ on a quantum system in state $\rho$ is

$$p(m) = \operatorname{tr}(M_m^\dagger M_m \rho)$$

and the state of the system after the measurement is

$$\frac{M_m \rho M_m^\dagger}{p(m)} .$$
Postulate 4 The state space of a composite system is the tensor product of the state spaces of the component systems. If each component system $i$, $i = 1, \ldots, n$ is prepared in the state $\rho_i$, then the joint state of the composite system is

$$\bigotimes_{i=1}^{n} \rho_i .$$

Density operators are especially useful if we want to disregard some parts of a quantum system. There is an operation which is somehow inverse to the tensor product operation in the following way. Imagine we have a quantum system comprised of two subsystems, $A$ and $B$, with state spaces $\mathcal{H}_A$ and $\mathcal{H}_B$. The joint system has the state space $\mathcal{H}_A \otimes \mathcal{H}_B$. Let the system start in a state $\rho_0$, which we will write as $\rho_0^{AB}$ to denote that this is a state of the joint system. We let the joint system evolve to a state $\rho^{AB}$, but at some point we choose to ignore part $B$. If the joint state of the system is not a product state, i.e. there are no density operators $\rho^A$ and $\rho^B$ such that $\rho^{AB} = \rho^A \otimes \rho^B$, we cannot just ignore system $B$. We must assume that $B$ will be modified later on beyond our control, potentially being observed by an arbitrary measurement. It turns out that we can express our uncertainty about the future of system $B$ in a probability distribution over possible states of system $A$. The density operator notation allows us to end up with one density operator for system $A$ that covers both the state of system $A$ and our lack of knowledge about the future of system $B$.

Definition A.4.6. The reduced density operator for system A is defined as

$\rho^A = \mathrm{tr}_B\, \rho^{AB}.$

The partial trace is defined for any product state $\sigma^A \otimes \sigma^B$ as

$\mathrm{tr}_B(\sigma^A \otimes \sigma^B) = \sigma^A\, \mathrm{tr}(\sigma^B)$

and extended to general density operators on AB by linearity.
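The definition of the partial trace can be made concrete in matrix form. The sketch below assumes that the joint basis is ordered as $|i\rangle_A |k\rangle_B \mapsto i \cdot \dim B + k$; the function names and the two test states are illustrative assumptions, not part of the text.

```python
def kron(a, b):
    """Tensor (Kronecker) product of two square matrices given as nested lists."""
    n, m = len(a), len(b)
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

def partial_trace_b(rho_ab, dim_a, dim_b):
    """Trace out subsystem B: (rho_A)_{ij} = sum_k rho_{(i,k),(j,k)}."""
    return [[sum(rho_ab[i * dim_b + k][j * dim_b + k] for k in range(dim_b))
             for j in range(dim_a)]
            for i in range(dim_a)]

# Product state: tr_B(sigma_a (x) sigma_b) = sigma_a * tr(sigma_b) = sigma_a
sigma_a = [[0.75, 0.25], [0.25, 0.25]]
sigma_b = [[0.5, 0.0], [0.0, 0.5]]
reduced_product = partial_trace_b(kron(sigma_a, sigma_b), 2, 2)

# Entangled state (|00> + |11>)/sqrt(2): no product decomposition exists,
# and the reduced state of A is the maximally mixed state.
bell = [[0.5, 0, 0, 0.5],
        [0, 0, 0, 0],
        [0, 0, 0, 0],
        [0.5, 0, 0, 0.5]]
reduced_bell = partial_trace_b(bell, 2, 2)
```

The second example shows the situation described above: the joint state is pure, yet the reduced state of A alone is mixed.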

A.5 Topology and Group Theory 

We assume basic familiarity with group theory and provide this section as a reference. We
refer the interested reader to [DF91] and [Wil70, Mun75] for a more in-depth coverage.
Most of the definitions were taken from [Rud67]. We will first introduce the basic notions
of topology and group theory and merge both of them to define topological groups later.

A.5.1 Topology

Definition A.5.1. A topology $\tau$ on a set S is a family of subsets of S such that

• $S \in \tau$ and $\emptyset \in \tau$, and

• $\tau$ is closed under finite intersections and arbitrary unions.

A set S with a topology $\tau$ is a topological space, but most often $\tau$ is assumed from the
context and S itself is called the topological space. The elements $t \in \tau$ are called open
sets, their complements in S are closed sets. The elements of S are sometimes referred to
as points in S.

The smallest closed set containing $A \subseteq S$ is the closure of A, written as $\bar{A}$. The largest
open set contained in A is the interior of A, denoted $A^\circ$. An interior point of A is an
element $p \in A^\circ$. If p is an interior point of A, then A is a neighbourhood of p.

Definition A.5.2. Let $\tau$ be a topology on S, and let $T \subseteq S$ be a subset of S. Then T
becomes a topological space with topology $\tau' = \{X \cap T \mid X \in \tau\}$. $\tau'$ is called the subspace
topology on T.

Definition A.5.3. A topological space S is called Hausdorff if for every pair of distinct
points $p_1, p_2 \in S$, there are disjoint neighbourhoods $N_1$ and $N_2$ of $p_1$ and $p_2$, respectively.

Definition A.5.4. A set $A \subseteq S$ is called compact if each family of open sets whose union
contains A has a finite subfamily whose union contains A.



Fact A.5.5. Every closed subset of a compact space is compact. Every compact subset of
a Hausdorff space is closed.

Definition A.5.6. A function $f : X \to Y$ from a topological space X to a topological
space Y is continuous if $f^{-1}(E) = \{p \in X \mid f(p) \in E\}$ is open in X for every open set
$E \subseteq Y$. If $f(E)$ is open in Y for every open set E in X, then f is called an open map.
If f is one-to-one, $f(X) = Y$, and both f and $f^{-1}$ are continuous, then f is called a
homeomorphism of X onto Y.

Fact A.5.7. Let X and Y be topological spaces. If $K \subseteq X$ is compact and $f : X \to Y$ is
continuous, then $f(K)$ is compact.

Definition A.5.8. Let S be a topological space. Denote by $C(S)$ the set of all bounded
continuous complex-valued functions on S. The support supp f of a complex function f
on S is the closure of $\{p \in S \mid f(p) \neq 0\}$. The set of all functions $f \in C(S)$ with compact
support is denoted by $C_c(S)$.

Let $f \in C(S)$. If for any $\epsilon > 0$ there is a compact set K in S such that $|f(p)| < \epsilon$
holds for all $p \in S \setminus K$, then f vanishes at infinity. The set of all $f \in C(S)$ that vanish at
infinity is denoted $C_0(S)$.

Fact A.5.9. Let S be a compact space. Then $C(S) = C_c(S) = C_0(S)$.

Definition A.5.10. Let $S_1, S_2, \dots, S_n$ be topological spaces. Then $S = S_1 \times S_2 \times \cdots \times S_n$
can be given the following product topology. For a choice of indices $i_1, i_2, \dots, i_k$ and open
sets $V_{i_j} \subseteq S_{i_j}$, $1 \le j \le k$, define $V = \{(p_1, \dots, p_n) \in S \mid p_{i_j} \in V_{i_j}, 1 \le j \le k\}$ and define a
subset E of S as open iff it is a union of such sets V.

Fact A.5.11. Let $S_1, S_2, \dots, S_n$ be Hausdorff spaces. Then $S = S_1 \times \cdots \times S_n$ is Hausdorff
as well. If the $S_i$ are compact, then S is compact.

A.5.2 Topological Groups 

Definition A.5.12. A group is a pair $(G, \star)$ of a set G and a binary operation $\star : G \times G \to G$
such that

• for all $a, b, c \in G$: $a \star (b \star c) = (a \star b) \star c$ (associativity),

• there is an element $e \in G$ such that $a \star e = e \star a = a$ for all $a \in G$ (identity element),
and

• for all $a \in G$, there is an element $a^{-1} \in G$ such that $a \star a^{-1} = a^{-1} \star a = e$ (inverse
element).

We will usually omit the operation $\star$ and will write ab to denote $a \star b$.

Definition A.5.13. The left translate of a subset $S \subseteq G$ by an element $a \in G$ is the set
$a \star S = aS = \{a \star s \mid s \in S\}$. The right translate of S by a is $Sa = \{s \star a \mid s \in S\}$.

Definition A.5.14. A homomorphism $\phi : G \to H$ from a group $(G, \circ)$ to a group $(H, \star)$
is a mapping that satisfies

$\phi(x \circ y) = \phi(x) \star \phi(y)$

for all $x, y \in G$.

Fact A.5.15. The set of all complex numbers of absolute value 1 forms a group under
multiplication. With the usual topology taken from the complex numbers it forms the
compact group $\mathbb{T}$.

Definition A.5.16. A character of a group G is a homomorphism $\chi : G \to \mathbb{T}$ into the
multiplicative group of complex numbers $a \in \mathbb{C}$ such that $|a| = 1$. We will call a character
trivial if $\chi(g) = 1$ for all $g \in G$.

Fact A.5.17. Let $\chi$ be a character of a finite group G. Then $\chi(g)^{|G|} = \chi(e) = 1$ for all
$g \in G$. Thus the values of $\chi$ are $|G|$-th roots of unity.
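To illustrate Definition A.5.16 and Fact A.5.17, the following sketch checks both properties for one character of the cyclic group $\mathbb{Z}_n$ under addition modulo n. The group, the value n = 6, and the character index k = 5 are assumptions chosen for this example.

```python
import cmath

def character(n, k):
    """Character chi_k of the cyclic group Z_n: g -> exp(2*pi*i*k*g/n)."""
    return lambda g: cmath.exp(2j * cmath.pi * k * g / n)

n = 6
chi = character(n, 5)

# Homomorphism property: chi(g + h mod n) = chi(g) * chi(h)
is_homomorphism = all(
    abs(chi((g + h) % n) - chi(g) * chi(h)) < 1e-12
    for g in range(n) for h in range(n)
)

# All values lie on the unit circle and are |G|-th roots of unity
on_unit_circle = all(abs(abs(chi(g)) - 1) < 1e-12 for g in range(n))
values_are_roots_of_unity = all(abs(chi(g) ** n - 1) < 1e-12 for g in range(n))
```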

Definition A.5.18. A topological group is a group G that is a topological Hausdorff space
with a topology $\tau$ such that the map $(x, y) \mapsto xy^{-1} : G \times G \to G$ is continuous. If the
whole group G is compact, we will call it a compact group.

In the following, let G be a topological group.

Fact A.5.19. The translation map $t_x(y) = xy$ and the inversion $x \mapsto x^{-1}$ are
homeomorphisms of G onto itself. If A is an open set of G and $B \subseteq G$, then AB is open. If A and
B are compact, AB is compact.

Definition A.5.20. Let f be a complex-valued function on G. Denote by $\tilde{f}$ the function

$\tilde{f}(g) = \overline{f(g^{-1})}$

for all $g \in G$.

Fact A.5.21. The set of all invertible linear operators on an n-dimensional Hilbert space
$\mathcal{H}$ forms a group under multiplication, the general linear group GL(n). It becomes a
topological space with the topology of element-wise convergence in $\mathbb{C}$ of the coordinate
functions. The subset of all unitary operators forms a group under multiplication. Given
the subspace topology, it becomes a compact group which we will denote by U(n).

A.6 Haar Measure and Integration on Compact Groups

In this section we will present the notion of an integral over a compact group G. We will
introduce the basic concepts of measure theory first and show what is typically understood
as an integral over G. We refer the reader to, and make use of the notation given in, [Rud67]
and [Edw72] for a concise presentation of measure and integration theory for compact and
locally compact abelian groups. See [Rud73] for an introduction to functional analysis.

For the introduction to measure theory, let X be a compact Hausdorff space. Let B be
the smallest family of subsets of X such that

• B contains all closed subsets of X,

• it is closed under countable unions, and

• it is closed under complementation.

The elements of B are called the Borel sets of X.

Definition A.6.1. A measure on X is a set function $\mu : B \to \mathbb{C}$ such that

1. $\mu$ is countably additive, i.e.

$\mu\left(\bigcup_i E_i\right) = \sum_i \mu(E_i)$

for a countable family of pairwise disjoint Borel sets $E_i \in B$, and

2. $\mu(E)$ is finite for all $E \in B$.

Definition A.6.2. Let $\mu$ be a measure on X. $\mu$ is positive if $\mu$ is real-valued and $\mu(E) \ge 0$
for all Borel sets E.

Definition A.6.3. A measure $\mu$ is a probability measure if $\mu$ is positive and $\mu(X) = 1$.

Definition A.6.4. Let $\mu$ be a measure on X. We define the total variation of $\mu$ by

$|\mu|(E) = \sup \sum_i |\mu(E_i)|$

where the supremum is taken over all finite collections of disjoint Borel sets $E_i$ whose union
is E. The total variation measures the largest variation of $\mu$ over all possible subdivisions
of a Borel set E and is used to define a norm on an arbitrary complex-valued measure.

Fact A.6.5. For every measure $\mu$, $|\mu|$ is a measure as well. If $\mu$ is a positive measure, then
$\mu = |\mu|$.

Definition A.6.6. A measure $\mu$ on X is called regular if, for every Borel set E,

$|\mu|(E) = \sup |\mu|(K) = \inf |\mu|(V)$

where K ranges over all compact subsets of E, and V ranges over all open supersets of E.
Let $\|\mu\| = |\mu|(X)$. Define

$M(X) = \{\mu \mid \mu \text{ measure on } X, \|\mu\| \text{ finite}\}.$

Let G denote a compact group. 

Definition A.6.7. The left translation operator for $a \in G$ is a function $L_a : C(G) \to C(G)$
that maps f to its left translate

$L_a f : x \mapsto f(a^{-1}x).$

Analogously, the right translation operator $R_a$ is given by

$R_a f : x \mapsto f(xa^{-1}).$

Definition A.6.8. A measure $\mu$ is left translationally invariant if $\mu(aS) = \mu(S)$ for all
$S \in B$, $a \in G$. $\mu$ is right translationally invariant if $\mu(Sa) = \mu(S)$ for all $S \in B$, $a \in G$.

Fact A.6.9 (Existence of the Haar Measure). There is a unique left and right
translationally invariant measure m on G such that $m(G) = 1$. This measure m is called the Haar
Measure on G.
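For the unitary group U(n), samples from the Haar measure can be generated numerically. The sketch below uses a standard construction that is not taken from the text: fill a matrix with independent complex Gaussian entries and orthonormalise its columns by the Gram-Schmidt procedure. The function names, the dimension n = 3, and the fixed seed are assumptions made for illustration; the code only verifies unitarity, not the invariance of the distribution.

```python
import random

def haar_unitary(n, rng):
    """Sample an n x n unitary by orthonormalising complex Gaussian columns."""
    cols = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
            for _ in range(n)]
    basis = []
    for v in cols:
        for u in basis:
            overlap = sum(x.conjugate() * y for x, y in zip(u, v))  # <u|v>
            v = [y - overlap * x for x, y in zip(u, v)]
        norm = sum(abs(y) ** 2 for y in v) ** 0.5
        basis.append([y / norm for y in v])
    # basis[j] holds the j-th column; return the matrix in row-major form
    return [[basis[j][i] for j in range(n)] for i in range(n)]

def dagger_times(u):
    """Return the matrix U^dagger U."""
    n = len(u)
    return [[sum(u[k][i].conjugate() * u[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

rng = random.Random(0)          # fixed seed so the example is reproducible
U = haar_unitary(3, rng)
gram = dagger_times(U)
max_deviation = max(abs(gram[i][j] - (1 if i == j else 0))
                    for i in range(3) for j in range(3))
```

Here `max_deviation` measures how far $U^\dagger U$ is from the identity and should be at the level of floating-point rounding.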

Using the theory of Lebesgue integration, it is possible to define the notion of integrating
over the compact group G with respect to some measure $\mu$. We also assume the definition
of an integrable function. See [Rud73] for a detailed introduction to Lebesgue integration.

Definition A.6.10. Let $\mu$ be a measure on G and f an integrable complex-valued function
on G. Denote the set of all integrable functions on G as $\mathcal{I}(G)$. We denote the Lebesgue
integral as

$\int_G f \, d\mu$

or

$\int_G f(g) \, d\mu(g)$

if the variable of integration is not clear from the context. If $\mu$ is the Haar measure m
and the group G is clear from the context, we will also write $\int f(g) \, dg$. Note that every
$f \in C(G)$ is integrable.

Definition A.6.11. A measure $\mu \in M(G)$ is called discrete if $\mu(G) = \mu(H)$ for some
countable subset H of G. We call $\mu$ continuous if $\mu(E) = 0$ for every countable set E.
$\mu \in M(G)$ is absolutely continuous if $\mu(E) = 0$ for every Borel set E with $m(E) = 0$.



A.7 Fubini-Study Measure

It will be important to integrate over the set of all pure quantum states. We have seen
that the set of all pure quantum states of a single-qubit system can be identified with the
unit sphere in real three-dimensional space. In that case, integration over the set of all pure
states is equivalent to integration over the real unit sphere. Although there is no clear geometrical
picture of the state space of multi-qubit systems, we can still define an invariant measure
and thus integration over the set of pure states of an N-qubit system.

Fact A.7.1. There is a unitarily invariant measure on the set of all pure quantum states
of a Hilbert space $\mathcal{H}$. This measure is typically referred to as the Fubini-Study measure
and written as



Note that the Fubini-Study measure is also the unitarily invariant uniform measure on
$\mathbb{C}S^{d-1}$. See [VK93, Ch. 11] for a more rigorous introduction of the invariant measure on
$\mathbb{C}S^{d-1}$, which is denoted as Pç' 1 in there.

A.8 Representation Theory

We will introduce some of the fundamental concepts of representation theory in this
section. We assume some basic familiarity with the topic and refer to [Edw72] for a brief
introduction to representation theory. [Boe67, Boe70] provides a complete but lengthy
approach including proofs for all results. [FH91] presents a more modern approach to
representation theory and especially the representation theory of the general linear group
GL(n) and its subgroups, especially U(n). [IV00] derives the representations for quite a
number of elementary groups.

Let G be a compact group.

Definition A.8.1. A representation of G is a homomorphism

$U : G \to GL(n)$

of G into the general linear group of invertible linear operators on an n-dimensional complex
vector space $\mathcal{H}_U$. $\mathcal{H}_U$ is the representation space of U, and $\dim \mathcal{H}_U$ is called the dimension
of the representation U.

Definition A.8.2. Let V be a representation of G. The action of G on $\mathcal{H}_V$ is defined as

$g v = V(g) v.$

We will require $\mathcal{H}_U$ to be a finite-dimensional Hilbert space with basis $\{|\psi_1\rangle, |\psi_2\rangle, \dots, |\psi_n\rangle\}$
and U to map to the group of unitary operators on $\mathcal{H}_U$. Furthermore, we will equip $\mathcal{H}_U$
with its usual topology as a complex Euclidean space. We require that the coordinate
functions

$g \mapsto \langle\psi_i| U(g) |\psi_j\rangle$

are continuous for all $1 \le i, j \le n$.

Our restrictive definition is justified by the fact that every continuous finite-dimensional
representation of a compact group is equivalent to a unitary representation and that every
finite-dimensional measurable representation is continuous [Edw72].

Definition A.8.3. A representation U is called irreducible if there are no subspaces of
$\mathcal{H}_U$ other than the trivial ones, $\{0\}$ and $\mathcal{H}_U$, which are invariant under $U(g)$ for all $g \in G$.
Otherwise, the representation is called reducible.

Definition A.8.4. Let V, W be representations of G. A G-homomorphism from $\mathcal{H}_V$ to
$\mathcal{H}_W$ is a linear map $\varphi : \mathcal{H}_V \to \mathcal{H}_W$ that respects the group action, i.e.

$\forall g \in G, v \in \mathcal{H}_V : \varphi(gv) = g(\varphi(v)).$

Writing the group action explicitly, this becomes

$\forall g \in G, v \in \mathcal{H}_V : \varphi(V(g)v) = W(g)\varphi(v).$

Fact A.8.5 (Schur's lemma). Let V and W be irreducible representations of a group G
and let $\varphi : \mathcal{H}_V \to \mathcal{H}_W$ be a G-homomorphism.

1. Then $\varphi = 0$ or $\varphi$ is an isomorphism.

2. If $\mathcal{H}_V = \mathcal{H}_W$ and $V = W$, then $\varphi = \lambda I$ for some $\lambda \in \mathbb{C}$.

Fact A.8.6. Every reducible representation U on $\mathcal{H}_U$ can be decomposed into a finite
direct sum of irreducible representations acting on invariant subspaces of $\mathcal{H}_U$,

$U(g) = U_1(g) \oplus U_2(g) \oplus \cdots \oplus U_k(g)$

where $U_i$ is an irreducible representation of G and $U_i(g)$ is a unitary operator on a subspace
$\mathcal{H}_i$ such that $\bigoplus_{i=1}^{k} \mathcal{H}_i = \mathcal{H}_U$.

Definition A.8.7. Two representations U and V of G are equivalent if there is an
isomorphism A from $\mathcal{H}_U$ onto $\mathcal{H}_V$ such that

$AU(g) = V(g)A$

for all $g \in G$. If A is unitary, then U and V are called unitarily equivalent.

Let U be a representation of G.

Definition A.8.8. We will call the matrix elements $U_{ij}(g)$ the coordinate functions of the
representation U.

Definition A.8.9. The character of a representation U of G is the function $\chi_U \in C(G)$
defined as

$\chi_U(g) = \mathrm{tr}\, U(g).$

We will usually write $\chi$ if the representation used is clear from the context.

Fact A.8.10. Every character $\chi$ of G is continuous.

Fact A.8.11. Let $\chi$ be a character of G. Then

$\chi(g^{-1}) = \overline{\chi(g)}$

for all $g \in G$.

As an example, we will consider the representations of the compact group U(d) for some
$d \in \mathbb{N}$. We refer to [VK91, Ch. 6] for a treatment of SU(2) and GL(2). See [VK93, Ch. 11]
for various analytical expressions of the irreducible representations of U(d) for general d.
As the actual matrices of the irreducible representations of U(d) are rather complicated,
we will skip them here and present only the necessary formulas for the dimensions of its
irreducible representations.

Fact A.8.12. The irreducible representations $D^s$ of U(d) are labelled by two integers
[VK93] $s = (k, l)$, $k, l \in \mathbb{N}$, and their dimension is given by

$d_{(k,l)} = \frac{k + l + d - 1}{d - 1} \binom{k + d - 2}{k} \binom{l + d - 2}{l}.$
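The dimension formula of Fact A.8.12 is easy to evaluate. The following sketch does so with exact integer arithmetic; the function name is a hypothetical choice, and the code assumes $d \ge 2$ and that the labels start at $k, l = 0$.

```python
from math import comb

def irrep_dimension(d, k, l):
    """Dimension d_(k,l) from Fact A.8.12, assuming d >= 2 and k, l >= 0."""
    numerator = (k + l + d - 1) * comb(k + d - 2, k) * comb(l + d - 2, l)
    # The formula yields an integer, so integer division is exact here.
    return numerator // (d - 1)

trivial = irrep_dimension(3, 0, 0)      # (k, l) = (0, 0)
fundamental = irrep_dimension(3, 1, 0)  # (k, l) = (1, 0) for d = 3
mixed = irrep_dimension(3, 1, 1)        # (k, l) = (1, 1) for d = 3
qubit = [irrep_dimension(2, k, 0) for k in range(4)]  # d = 2 gives k + l + 1
```

For d = 3 the three labels above evaluate to 1, 3 and 8, and for d = 2 the formula reduces to $k + l + 1$.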
A.9 Fourier Analysis

The material in this section follows the notation set in [Edw72]. We refer to [Rud73] for
the details of Banach space theory. See [AOBR80] for the approximation of Fourier series.
However, we note slight differences in the placement of complex conjugates and global
dimensionality factors between [Edw72] and [AOBR80]. In this section, we assume G is a
compact group and m is the Haar measure on G.

A.9.1 Banach Spaces 

The concept of a metric is the generalization of the concept of distance in a Euclidean
space. It is generalized to arbitrary sets in the following way.

Definition A.9.1. A metric on a set X is a function $d : X \times X \to \mathbb{R}$ such that

1. $d(x, y) \ge 0$ for all $x, y \in X$ (non-negativity),

2. $d(x, y) = 0$ if and only if $x = y$ (definiteness),

3. $d(x, y) = d(y, x)$ for all $x, y \in X$ (symmetry), and

4. $d(x, z) \le d(x, y) + d(y, z)$ for all $x, y, z \in X$ (triangle inequality).

The pair (X, d) is called a metric space.

Fact A.9.2. A normed vector space $(V, \|\cdot\|)$ is a metric space with respect to the metric
$d(u, v) = \|u - v\|$. V can be given the usual topology induced by its norm to turn V into
a topological vector space.

Definition A.9.3. A Banach space is a complete normed complex vector space.

Fact A.9.4. Every Hilbert space is a Banach space.

Fact A.9.5. Let $p \ge 1$ be a real number. Then the function $\|\cdot\|_p : \mathcal{I}(G) \to \mathbb{R}$ on the
integrable functions $\mathcal{I}(G)$ defined as

$\|f\|_p = \left( \int_G |f(g)|^p \, dg \right)^{1/p}$

is a norm on $\mathcal{I}(G)$.

Definition A.9.6. A function $f \in C(G)$ is zero almost everywhere if
$m(\{g \in G \mid f(g) \neq 0\}) = 0$. Two functions $f, g \in C(G)$ are equal almost everywhere if
$f - g$ is zero almost everywhere.

Fact A.9.7. C(G) is a Banach space with addition of functions f, g defined in the usual
way as $(f + g)(x) = f(x) + g(x)$. $L^p(G) = \{f \in \mathcal{I}(G) \mid \|f\|_p \text{ is finite}\}$, with two functions
identified if they are equal almost everywhere, is a Banach space.
$L^2(G)$ is a Hilbert space with inner product

$(f, g) = \int_G f(x) \overline{g(x)} \, dm(x),$

where $\overline{g}$ denotes the usual complex conjugate.

For $1 \le q \le p$, $L^p(G) \subseteq L^q(G)$. Furthermore, $C(G) \subseteq L^p(G)$ for any $p \ge 1$.

It is important to notice that functions in $L^p(G)$ are not defined point-wise, as $m(\{x\}) = 0$
for all x. The definition only makes sense if we are interested in the way they are
integrated against certain measures.

A.9.2 Fourier Analysis 

Definition A.9.8. Let $\hat{G}$ be the set of all irreducible unitary representations of G, where
equivalent representations are identified and one representative of each equivalence class is
chosen for $\hat{G}$. Thus $\hat{G} = \{D^s\}$ is the set of pairwise inequivalent, irreducible unitary
representations of G. We will order the irreducible representations $D^s$ by their increasing
dimensionality $d_s$. Note that the trivial representation $D^s(g) = 1$ for all $g \in G$ has dimension $d_s = 1$.

Definition A.9.9. Let $f \in L^1(G)$ and $D^s \in \hat{G}$. The Fourier transform of f is defined for
each representation $D^s$ as

$\hat{f}(D^s) = \int_G f(g) D^s(g) \, dg.$

Note that $\hat{f}(D^s)$ is a linear operator on the representation space of $D^s$. As $\hat{f}(D^s)$ is a
$d_s \times d_s$ complex matrix, it is not useful to treat $\hat{f}$ as an ordinary scalar-valued function.

Definition A.9.10. The convolution of two integrable functions $f, g \in \mathcal{I}(G)$ is defined as

$(f * g)(x) = \int_G f(y) g(y^{-1}x) \, dm(y).$

Convolution is an associative operation.

Fact A.9.11. Let $f, g \in \mathcal{I}(G)$ be integrable functions and $D^s \in \hat{G}$. Then

$\widehat{f * g}(D^s) = \hat{f}(D^s) \hat{g}(D^s).$

We can also define the Fourier transform for measures, which represent a more general
class of objects than the functions in $L^1(G)$. This transformation is typically referred to as the
Fourier-Stieltjes transform. We will first introduce a correspondence between measures
and functions in $L^1(G)$.

Fact A.9.12. If $f \in L^1(G)$, then the measure $\mu(E) = \int_E f \, dm$ is in M(G) and absolutely
continuous. For every absolutely continuous measure $\mu \in M(G)$, there is a function
$f \in L^1(G)$ such that $\mu(E) = \int_E f \, dm$ for all Borel sets E. Furthermore, $\|\mu\| = \|f\|_1$.

Definition A.9.13. The Fourier-Stieltjes transform of a measure $\mu \in M(G)$ is given by

$\hat{\mu}(D^s) = \int_G D^s(g) \, d\mu(g).$

Definition A.9.14. Associate with every Borel set E of G the set
$E_2 = \{(x, y) \in G \times G \mid xy \in E\}$. Then $E_2$ is a Borel set of $G^2$. The convolution of two
measures $\mu, \lambda \in M(G)$ is defined as

$(\mu * \lambda)(E) = (\mu \times \lambda)(E_2),$

where $\mu \times \lambda$ is the product measure on the product space $G^2$.

Fact A.9.15. For $\mu, \lambda \in M(G)$, we have $\mu * \lambda \in M(G)$. Convolution is associative, and it is
commutative if G is abelian. Finally, $\|\mu * \lambda\| \le \|\mu\| \|\lambda\|$.

Fact A.9.16. Let $\mu, \lambda \in M(G)$, $D^s \in \hat{G}$. Then

$\widehat{\mu * \lambda}(D^s) = \hat{\mu}(D^s) \hat{\lambda}(D^s).$

Fact A.9.17. The normalized coordinate functions $\sqrt{d_s} D^s_{ij} : g \mapsto \sqrt{d_s} D^s_{ij}(g)$ form a
complete orthonormal set for the Hilbert space $L^2(G)$. The orthogonality relations read

$\int_G D^s_{ij}(g) \overline{D^{s'}_{mn}(g)} \, dg = \frac{1}{d_s} \delta_{s,s'} \delta_{i,m} \delta_{j,n}.$

It follows that

$\int_G \mathrm{tr}(D^s(g)) \, \overline{\mathrm{tr}(D^{s'}(g))} \, dg = \delta_{s,s'}.$

Fact A.9.18. Let $f \in L^p(G)$ for $p \ge 1$ or $p = \infty$. If $\hat{f}(D^s) = 0$ for all $D^s \in \hat{G}$, then
$f(g) = 0$ for almost all g. As a corollary, let $f, g \in L^p(G)$. If $\hat{f}(D^s) = \hat{g}(D^s)$ for all $D^s \in \hat{G}$,
then $f = g$ almost everywhere.

Fact A.9.19. The Parseval formula is the following integral identity for $f \in L^2(G)$:

$\|f\|_2^2 = \int_G |f(g)|^2 \, dg = \sum_s d_s \, \mathrm{tr}\, \hat{f}(D^s) \hat{f}(D^s)^\dagger.$

Fact A.9.20. The Peter-Weyl theorem states as a direct consequence that

$f(g) = \sum_s d_s \, \mathrm{tr}\, \hat{f}(D^s) D^s(g)^\dagger$

almost everywhere, the limit being the strong limit in $L^2(G)$ of its partial sums over finite
$P \subseteq \hat{G}$.
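The Parseval formula and the Fourier inversion of the Peter-Weyl theorem can be checked numerically on a small example. The sketch below assumes the finite cyclic group $\mathbb{Z}_n$, which is compact, has the uniform distribution as its Haar measure, and has the one-dimensional irreducible representations $D^s(g) = e^{2\pi i s g / n}$ with $d_s = 1$. The group, the test function, and all names are illustrative assumptions.

```python
import cmath

n = 8
f = [complex(g * g - 3, g) for g in range(n)]   # arbitrary test function on Z_n

def fourier(f, s):
    """f_hat(D^s) = int f(g) D^s(g) dg with D^s(g) = exp(2*pi*i*s*g/n)."""
    n = len(f)
    return sum(f[g] * cmath.exp(2j * cmath.pi * s * g / n) for g in range(n)) / n

f_hat = [fourier(f, s) for s in range(n)]

# Parseval: ||f||_2^2 = int |f(g)|^2 dg = sum_s d_s tr f_hat(D^s) f_hat(D^s)^dagger
norm_sq = sum(abs(x) ** 2 for x in f) / n
parseval_sum = sum(abs(c) ** 2 for c in f_hat)

# Inversion: f(g) = sum_s d_s tr f_hat(D^s) D^s(g)^dagger
reconstructed = [
    sum(f_hat[s] * cmath.exp(-2j * cmath.pi * s * g / n) for s in range(n))
    for g in range(n)
]
inversion_error = max(abs(reconstructed[g] - f[g]) for g in range(n))
```

Since $\mathbb{Z}_n$ is finite, the sums over s are finite and the identities hold exactly up to rounding.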

Fact A.9.21. The Riemann-Lebesgue Lemma states that for any $f \in L^1(G)$

$\lim_{s \to \infty} \|\hat{f}(D^s)\| = 0.$

Definition A.9.22. A function $\phi \in L^1(G)$ is called positive-definite if for all $f \in C(G)$,

$(\tilde{f} * \phi * f)(e) = \int_G \int_G \phi(h^{-1}g) \overline{f(g)} f(h) \, dg \, dh \ge 0.$

Definition A.9.23. P(G) is the set of all continuous positive-definite functions on G.

Fact A.9.24. A function $\phi \in L^1(G)$ is positive-definite if and only if $\hat{\phi}(D^s)$ is positive
self-adjoint for all s.

Definition A.9.25. A "nice" positive-definite function is a positive-definite function $\phi$ on
G such that

• $\phi$ is continuous (i.e. $\phi \in P(G)$) or

• there is a number $m_\phi$ and a neighbourhood $N_\phi$ of $e \in G$ such that

$(\tilde{f} * \phi * f)(e) \le m_\phi \|f\|_1^2$

for all $f \in C(G)$ whose support is contained in $N_\phi$.

Fact A.9.26 ([AOBR80, Edw72]). Another version of the Peter-Weyl theorem states
that the Fourier series of a nice positive-definite function f converges uniformly for
all (almost all if f is not continuous) $g \in G$,

$f(g) = \lim_{S \to \infty} \sum_{s \le S} d_s \, \mathrm{tr}\, \hat{f}(D^s) D^s(g)^\dagger,$

where we ordered the irreducible representations $D^s$ by their increasing dimensionality $d_s$.

Fact A.9.27 ([AOBR80, Edw72]). A consequence is the Peter-Weyl Approximation
Theorem. For all nice positive-definite f and all $\epsilon > 0$, there is a number $N_\epsilon$ such that

$\left| f(g) - \sum_{s=0}^{N_\epsilon} d_s \, \mathrm{tr}\, \hat{f}(D^s) D^s(g)^\dagger \right| < \epsilon$

for all (almost all if f is not continuous) $g \in G$.



A.10 Fields and Rings 

Although we have used fields and assumed basic familiarity with them as mathematical
objects, it is necessary to give the exact definition. It will be important to distinguish
fields from rings to understand different constructions involved in this thesis. We refer to
[LN94] and [McD74] for a general treatment of finite fields and finite rings. [Wan97] and
[Wan03] deal with Galois fields and Galois rings in particular.

Definition A.10.1. A ring $(R, +, \star)$ is a set R together with two binary operations +
(addition) and $\star$ (multiplication) such that

• (R, +) is an abelian group,

• $a \star (b \star c) = (a \star b) \star c$ for all $a, b, c \in R$ (multiplicative associativity), and

• $a \star (b + c) = (a \star b) + (a \star c)$ and $(a + b) \star c = (a \star c) + (b \star c)$ for all $a, b, c \in R$
(distributivity).

We will typically denote the ring operations as addition and multiplication and we will
understand that ab means $a \star b$.

Definition A.10.2. A field $(F, +, \star)$ is a set F together with two binary operations + and
$\star$ such that

• $(F, +, \star)$ is a ring where we denote the additive identity with 0,

• $(F \setminus \{0\}, \star) = F^*$ is an abelian group with multiplicative identity $1 \neq 0$, and

• $ab = 0$ implies $a = 0$ or $b = 0$ for all $a, b \in F$.

If F is finite, we will call it a finite field or Galois field.

Definition A.10.3. A subring of a ring R is a subset $S \subseteq R$ such that S is closed under +
and $\star$, and forms a ring with respect to these operations.

Definition A.10.4. An ideal of a ring R is a subset $J \subseteq R$ such that J is a subring of R
and $ar, ra \in J$ for all $a \in J$, $r \in R$.

Fact A.10.5. Let R be a commutative ring with multiplicative identity 1. An ideal J is
principal if there is an $a \in R$ such that $J = (a) = \{ra \mid r \in R\}$. We will call J generated
by a.

An ideal J partitions a ring R into disjoint cosets $[a] = a + J = \{a + j \mid j \in J\}$. Elements
a and b in the same coset or residue class of J are called congruent modulo J and we will
write $a \equiv b \bmod J$. This is equivalent to $a - b \in J$.

Fact A.10.6. The set of residue classes of a ring R modulo an ideal J forms a ring if we
define addition and multiplication of residue classes by letting

• $(a + J) + (b + J) = (a + b) + J$ and

• $(a + J)(b + J) = ab + J$

for any $a, b \in R$. It is called the residue class ring and denoted by R/J.

Definition A.10.7. The characteristic of a ring R is the smallest positive integer $n \in \mathbb{Z}$
such that $nr = 0$ for all $r \in R$. If there is no such integer n, we say that R has characteristic
0.

Fact A.10.8. Let R be a commutative ring with prime characteristic p. Then

$(a + b)^{p^n} = a^{p^n} + b^{p^n}$

for all $a, b \in R$ and all $n \in \mathbb{N}$.
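The reason behind Fact A.10.8 is that all binomial coefficients $\binom{p^n}{k}$ with $0 < k < p^n$ are divisible by p, so the mixed terms of the binomial expansion vanish in characteristic p. The following sketch checks this divisibility for a few small cases; the function name and the chosen values of p and n are assumptions for illustration.

```python
from math import comb

def middle_binomials_vanish(p, n):
    """True if binom(p^n, k) is divisible by p for all 0 < k < p^n."""
    q = p ** n
    return all(comb(q, k) % p == 0 for k in range(1, q))

prime_cases = all(middle_binomials_vanish(p, n)
                  for p in (2, 3, 5, 7) for n in (1, 2, 3))

# The primality of the characteristic matters: binom(4, 2) = 6 is not
# divisible by 4, and indeed (1 + 1)^4 = 0 but 1^4 + 1^4 = 2 in Z_4.
composite_counterexample = comb(4, 2) % 4 != 0
fails_in_z4 = pow(1 + 1, 4, 4) != (pow(1, 4, 4) + pow(1, 4, 4)) % 4
```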

Fact A.10.9. Any finite field has prime characteristic.

Fact A.10.10. For any prime power $p^k$, all finite fields with $p^k$ elements are isomorphic
and we write $\mathbb{F}_{p^k}$ or $GF(p^k)$ to denote the finite field with $p^k$ elements. All finite fields
have a prime power number of elements.

Fact A.10.11. For a ring R, the set of polynomials

$\sum_{i=0}^{n} a_i X^i$

with $a_i \in R$ and X a formal variable forms a ring under the usual addition and multiplication
of polynomials, with the zero polynomial as additive identity.

Definition A.10.12. The ring of polynomials over R is called the polynomial ring over R
and is denoted by $R[X]$.

Definition A.10.13. A polynomial $p(X) = \sum_{i=0}^{n} a_i X^i$ is called monic if $a_n = 1$.

A.10.1 Galois Fields

For any prime p, a usual method to construct the finite field $\mathbb{F}_p$ is to take the integers
modulo p, which form a field with p elements. We will now describe how fields with a
prime power number of elements $p^m$, $m \in \mathbb{N}$, can be constructed.

For simplicity, we will use a simpler approach than the one presented in
[LN94]. The approach taken here is streamlined to facilitate later constructions and ease
understanding for the purpose of applications of finite fields to this thesis. Furthermore,
this section should provide the reader with some intuition about the structure of finite
fields.

Definition A.10.14. A polynomial $p(X) \in \mathbb{F}_p[X]$ is primitive if there are no polynomials
$r(X), s(X) \in \mathbb{F}_p[X]$ such that $p(X) = r(X)s(X)$ with neither $r(X)$ nor $s(X)$ constant.
Intuitively, this is similar to the definition of a prime number and will serve an analogous
purpose.

This definition implies that a primitive polynomial $p(X)$ is irreducible; in particular, it cannot
have any root $\xi$, since a root would lead to a factorization with $(X - \xi) \mid p(X)$. We will now use a monic
primitive polynomial to define $GF(p^m)$ as a residue class ring of $\mathbb{F}_p[X]$ which will turn out
to be a field.

Theorem A.10.15. Let $h(X) \in \mathbb{F}_p[X]$ be a monic primitive polynomial of degree $m \in \mathbb{N}$.
Then $\mathbb{F}_p[X]/(h(X))$ is a finite field with $p^m$ elements. We will denote this field by $GF(p^m)$.
It is sometimes called an extension field.

Proof. $h(X)$ is a monic polynomial of degree m, hence the remainders of polynomials
in $\mathbb{F}_p[X]$ after division by $h(X)$ are polynomials of degree up to $m - 1$. If we pick the
lowest-degree representative for each coset in $\mathbb{F}_p[X]/(h(X))$, we have that

$\mathbb{F}_p[X]/(h(X)) = \{a_0 + a_1 X + \cdots + a_{m-1} X^{m-1} \mid a_0, a_1, \dots, a_{m-1} \in \mathbb{F}_p\}.$

The Extended Euclidean Algorithm shows that for any nonzero $f(X) \in GF(p^m)$, there is an inverse
$f^{-1}(X)$ such that $f(X) f^{-1}(X) = 1$, and hence that there are no zero divisors. □
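The construction in Theorem A.10.15 can be carried out explicitly for a small case. The sketch below assumes $p = 2$ and $h(X) = 1 + X + X^3$, which has no non-constant proper factor over $\mathbb{F}_2$, and represents each residue class by the coefficient tuple $(a_0, a_1, a_2)$ of its lowest-degree representative. The names and the brute-force search for inverses (instead of the Extended Euclidean Algorithm) are choices made for this illustration.

```python
from itertools import product

P = 2
H = [1, 1, 0, 1]      # h(X) = 1 + X + X^3, coefficients in increasing degree
M = len(H) - 1        # degree m of h(X)

def multiply(a, b):
    """Multiply two elements of F_p[X]/(h(X)) given as length-m tuples."""
    prod = [0] * (2 * M - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % P
    # Reduce modulo the monic h(X): cancel the leading term degree by degree
    for deg in range(2 * M - 2, M - 1, -1):
        c = prod[deg]
        for k in range(M + 1):
            prod[deg - M + k] = (prod[deg - M + k] - c * H[k]) % P
    return tuple(prod[:M])

elements = list(product(range(P), repeat=M))   # all p^m residue classes
zero = (0,) * M
one = (1,) + (0,) * (M - 1)
nonzero = [a for a in elements if a != zero]

# Field properties claimed by the theorem, checked by brute force
has_inverse = all(any(multiply(a, b) == one for b in nonzero) for a in nonzero)
no_zero_divisors = all(multiply(a, b) != zero for a in nonzero for b in nonzero)

# Example product: X * X^2 = X^3 = 1 + X modulo h(X)
x_cubed = multiply((0, 1, 0), (0, 0, 1))
```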

Fact A.10.16. Any two extension fields over $\mathbb{F}_p$ with monic, primitive polynomials $h_1(X)$, $h_2(X)$
of degree m are isomorphic. This justifies the label $GF(p^m)$, which is independent of the
primitive polynomial that generates the extension field, and it justifies speaking of the
finite or Galois field with $p^m$ elements.

From the proof of the preceding theorem it is apparent that we can identify polynomials
in $GF(p^m)$ as vectors with m components. Usually, elements of the extension as well as
elements of the base field $\mathbb{F}_p$ are denoted by latin letters. To avoid confusion, we will use
greek letters for the extension field and latin letters for the base field in case we will have
to mix both. We will use

$\vec{\alpha} = (\alpha_0, \alpha_1, \dots, \alpha_{m-1})^T$

to denote the vector associated to $\alpha$ when we consider the vector space $\mathbb{F}_p^m$.

Fact A.10.17. $GF(p^m)$ is an m-dimensional vector space over $\mathbb{F}_p$, denoted by $\mathbb{F}_p^m$. We
can also equip $\mathbb{F}_p^m$ with an inner product to turn it into an inner product space. We will
conveniently use the standard inner product

$\langle \alpha, \beta \rangle = \sum_{i=0}^{m-1} \alpha_i \beta_i.$

As a basis, we can pick the polynomials $\{1, X, X^2, \dots, X^{m-1}\}$. Then, the vector
representation of a polynomial $p(X) \in GF(p^m)$ is given by the column vector of its m
coefficients. It follows from the distributivity of multiplication and addition that multiplication
is a linear function on $GF(p^m)$ as a vector space.

Fact A.10.18. For every $a \in GF(p^m)$, there is a matrix $M_a \in \mathbb{F}_p^{m \times m}$ such that
$\overrightarrow{ab} = M_a \vec{b}$ for all $b \in GF(p^m)$.

There is a very important function from $GF(p^m)$ to $\mathbb{F}_p$, the trace mapping.

Definition A.10.19. The trace is a mapping $\mathrm{tr}_{GF(p^m)} : GF(p^m) \to \mathbb{F}_p$ such that

$\mathrm{tr}_{GF(p^m)}(\alpha) = \sum_{i=0}^{m-1} \alpha^{p^i}.$

If it is unambiguously clear from the context, we will write "tr" instead of "$\mathrm{tr}_{GF(p^m)}$". Note
that this function is referred to as absolute trace as it maps to the prime field $\mathbb{F}_p$.

The trace has several useful properties.

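Fact A.10.20. The trace is a linear functional on $GF(p^m)$, i.e.

• $\mathrm{tr}(\alpha + \beta) = \mathrm{tr}(\alpha) + \mathrm{tr}(\beta)$ for all $\alpha, \beta \in GF(p^m)$ and

• $\mathrm{tr}(c\alpha) = c\,\mathrm{tr}(\alpha)$ for all $\alpha \in GF(p^m)$, $c \in \mathbb{F}_p$.

From the linearity of the trace it follows that there is a vector t such that $\mathrm{tr}\,\alpha = \langle t, \alpha \rangle$.
Furthermore,

• $\mathrm{tr}(a) = ma$ for all $a \in \mathbb{F}_p$ and

• $\mathrm{tr}(\alpha^p) = \mathrm{tr}(\alpha)$ for all $\alpha \in GF(p^m)$.

Fact A.10.21. Let $a \in GF(p^m)$. Then

$\sum_{x \in GF(p^m)} \left(e^{2\pi i/p}\right)^{\mathrm{tr}(ax)} = \begin{cases} p^m & a = 0 \\ 0 & a \neq 0 \end{cases}$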
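The trace and the character sum of Fact A.10.21 can be evaluated directly for a small field. The sketch below assumes $GF(2^3) = \mathbb{F}_2[X]/(1 + X + X^3)$ with elements stored as coefficient tuples; this choice of field and all helper names are assumptions for illustration.

```python
import cmath
from itertools import product

P, H = 2, [1, 1, 0, 1]     # GF(2^3) = F_2[X]/(1 + X + X^3)
M = len(H) - 1

def add(a, b):
    return tuple((x + y) % P for x, y in zip(a, b))

def multiply(a, b):
    """Multiply two residue classes modulo h(X)."""
    prod = [0] * (2 * M - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % P
    for deg in range(2 * M - 2, M - 1, -1):
        c = prod[deg]
        for k in range(M + 1):
            prod[deg - M + k] = (prod[deg - M + k] - c * H[k]) % P
    return tuple(prod[:M])

def power(a, e):
    result = (1,) + (0,) * (M - 1)
    for _ in range(e):
        result = multiply(result, a)
    return result

def trace(a):
    """tr(a) = sum_{i=0}^{m-1} a^(p^i), returned as an integer in F_p."""
    total = (0,) * M
    for i in range(M):
        total = add(total, power(a, P ** i))
    assert all(c == 0 for c in total[1:])   # the trace lies in the prime field
    return total[0]

elements = list(product(range(P), repeat=M))
zero, one = (0,) * M, (1,) + (0,) * (M - 1)

def character_sum(a):
    """sum over x of (e^(2 pi i / p))^tr(ax)."""
    omega = cmath.exp(2j * cmath.pi / P)
    return sum(omega ** trace(multiply(a, x)) for x in elements)

sum_at_zero = character_sum(zero)
sums_elsewhere = [character_sum(a) for a in elements if a != zero]
```

The sum equals $p^m = 8$ for $a = 0$ and vanishes (up to rounding) for every other a.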



A.10.2 Galois Rings 



The construction of Galois rings is quite similar to the construction of Galois fields.
However, we will not consider the most general case of Galois rings but restrict ourselves to the
case of rings over the base ring $\mathbb{Z}_4$. We refer to [Wan97] for a complete coverage of Galois
rings over $\mathbb{Z}_4$ and [Wan03] for the more general case of a Galois ring over $\mathbb{Z}_n$ for arbitrary
$n \in \mathbb{N}$. Note that we will write $\mathbb{Z}_2$ instead of $\mathbb{F}_2$ for easier reading. Furthermore, we will
give a slightly stricter definition of a Galois ring than is usually adopted in the literature
to focus on the specific results needed for later constructions.

Defmition A.10.22. The map~: Z 4 [X] -> Z 2 [X\ is defined for /(X) = a + a x X + h 

f{X) = (oq mod 2) + (ai mod 2)X + . . . (a n mod 2)X n . 



Definition A.10.23. We will call a polynomial h(X) G Z 4 [X] bàsic primitive if h(X) is 
a primitive polynomial in Z 2 [X] . 

Definition A.10.24. Let h(X) G 1ía[X] be a monic, bàsic primitive polynomial of degree 
m. Then the residue class ring 

Z 4 [X]/(h{X)) 9É {a + ai X + ■ ■ ■ + a m ^X m - 1 \ a , a u . . . , a m _ x G Z 4 } 

is called the Galois ring and denoted by GR(4 m ). We say that ft-pí) generales the Galois 
Ring. We will call this polynomial representat ion of GR(4 m ) the additive reprès entation. 

Fact A. 10.25. Any two Galois rings over with monic, bàsic primitive polynomials 
h\(X), hz(X) of degree m are isomorphic. That justifies the label GR(A m ) which is inde- 
pendent of the primitive polynomial that generates the residue class ring. 
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Fact A. 10.26. GR(A m ) has characteristic 4. 

There is a close connection between Galois fields and Galois rings that stems from the
fact that the generating polynomial $h(X)$ is basic primitive and thus $\tilde{h}(X)$ is primitive
in $\mathbb{Z}_2[X]$. Hence the Galois field with $2^m$ elements is contained in the Galois ring with $4^m$
elements.

Fact A.10.27. $\mathrm{GR}(4^m) \to \mathrm{GR}(4^m)/(2) = \mathrm{GF}(2^m)$.

However, it is not true that the Galois ring is a trivial product of a Galois field and
another simple structure. It turns out that $h(X)$ ensures a richer structure that cannot easily
be derived from $\tilde{h}(X)$. Besides the additive representation, there is a second representation
that gives more insight into the structure of $\mathrm{GR}(4^m)$. This second representation is called
the 2-adic representation.

Fact A.10.28. The element $X \in \mathrm{GR}(4^m)$ is of order $2^m - 1$ and is the root of a unique
monic basic primitive polynomial $h(X)$ of degree $m$ that generates $\mathrm{GR}(4^m)$.
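To make the additive representation and the order of $X$ concrete, the following minimal sketch implements multiplication in $\mathbb{Z}_4[X]/(h(X))$ on coefficient tuples. The polynomial $h(X) = X^3 + 2X^2 + X + 3$ is an assumption chosen for illustration (it is not taken from the text); its reduction modulo 2 is $x^3 + x + 1$. The helper names `mul` and `order` are likewise hypothetical.

```python
# Assumed example: h(X) = X^3 + 2X^2 + X + 3 over Z_4, coefficients low to high.
H = (3, 1, 2, 1)
M = len(H) - 1

def mul(u, v):
    """Multiply two elements of Z_4[X]/(h(X)), given as coefficient tuples."""
    prod = [0] * (2 * M - 1)
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            prod[i + j] += ui * vj
    # h is monic, so X^M = -(H[0] + H[1] X + ... + H[M-1] X^(M-1)).
    for k in range(2 * M - 2, M - 1, -1):
        c = prod[k]
        for j in range(M):
            prod[k - M + j] -= c * H[j]
    return tuple(x % 4 for x in prod[:M])

ONE = (1,) + (0,) * (M - 1)
X = (0, 1) + (0,) * (M - 2)

def order(u):
    """Multiplicative order of a unit u (does not terminate for non-units)."""
    n, power = 1, u
    while power != ONE:
        power = mul(power, u)
        n += 1
    return n

# The reduction of h modulo 2 is 1 + x + x^3, a primitive polynomial over Z_2.
print(tuple(c % 2 for c in H))  # (1, 1, 0, 1)
print(order(X))                 # 7
```

For this choice of $h(X)$ the element $X$ has order $2^3 - 1 = 7$, in line with Fact A.10.28.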

The 2-adic representation is facilitated by the powers of X. 

Definition A.10.29. The set $T_m = \{0, 1, X, X^2, X^3, \dots, X^{2^m - 2}\}$ is called the Teichmüller
set of the Galois ring $\mathrm{GR}(4^m)$.

Notice that $T_m \setminus \{0\}$ is a cyclic multiplicative group generated by $X$.

Fact A.10.30. Let $T_m$ be the Teichmüller set of $\mathrm{GR}(4^m)$. Then for any element
$c \in \mathrm{GR}(4^m)$, there are unique $a, b \in T_m$ such that

c = a + 2b. 
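The uniqueness of the 2-adic representation can be checked by brute force in a small case. This is a minimal sketch, assuming $m = 3$ and the illustrative polynomial $h(X) = X^3 + 2X^2 + X + 3$ (an assumption, not taken from the text); the helper names are hypothetical. It enumerates all sums $a + 2b$ with $a, b$ in the Teichmüller set and counts the distinct results.

```python
from itertools import product

# Assumed example: h(X) = X^3 + 2X^2 + X + 3 over Z_4, coefficients low to high.
H = (3, 1, 2, 1)
M = len(H) - 1

def add(u, v):
    return tuple((x + y) % 4 for x, y in zip(u, v))

def twice(u):
    return tuple((2 * x) % 4 for x in u)

def mul(u, v):
    """Multiply in Z_4[X]/(h(X)); elements are coefficient tuples."""
    prod = [0] * (2 * M - 1)
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            prod[i + j] += ui * vj
    for k in range(2 * M - 2, M - 1, -1):   # reduce using X^M = -(lower part of h)
        c = prod[k]
        for j in range(M):
            prod[k - M + j] -= c * H[j]
    return tuple(x % 4 for x in prod[:M])

ZERO = (0,) * M
ONE = (1,) + (0,) * (M - 1)
X = (0, 1) + (0,) * (M - 2)

# Teichmueller set: 0 together with the powers X^0, ..., X^(2^M - 2).
T = [ZERO]
power = ONE
for _ in range(2 ** M - 1):
    T.append(power)
    power = mul(power, X)

# Map every sum a + 2b back to the pair (a, b) that produced it.
decomposition = {add(a, twice(b)): (a, b) for a, b in product(T, T)}

print(len(T), len(decomposition))  # 8 64
```

Since the $8 \cdot 8 = 64$ pairs $(a, b)$ yield $64 = 4^3$ distinct ring elements, every element of this ring has exactly one representation $a + 2b$.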

Using the 2-adic representation we can define a trace function for Galois rings. 

Definition A.10.31. The generalized Frobenius map of $\mathrm{GR}(4^m)$ is defined as

$$f : \mathrm{GR}(4^m) \to \mathrm{GR}(4^m), \quad c = a + 2b \mapsto a^2 + 2b^2.$$

Fact A.10.32. The generalized Frobenius map is a ring automorphism of $\mathrm{GR}(4^m)$. The
fixed elements of $f$ are the elements of $\mathbb{Z}_4$. Furthermore, $f$ is of order $m$.
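The properties of the generalized Frobenius map can be verified exhaustively in a small ring. The following minimal sketch assumes $m = 3$ and the illustrative polynomial $h(X) = X^3 + 2X^2 + X + 3$ (an assumption, not taken from the text); all helper names are hypothetical. It builds the 2-adic representation by enumeration and applies $a + 2b \mapsto a^2 + 2b^2$.

```python
from itertools import product

# Assumed example: h(X) = X^3 + 2X^2 + X + 3 over Z_4, coefficients low to high.
H = (3, 1, 2, 1)
M = len(H) - 1

def add(u, v):
    return tuple((x + y) % 4 for x, y in zip(u, v))

def twice(u):
    return tuple((2 * x) % 4 for x in u)

def mul(u, v):
    """Multiply in Z_4[X]/(h(X)); elements are coefficient tuples."""
    prod = [0] * (2 * M - 1)
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            prod[i + j] += ui * vj
    for k in range(2 * M - 2, M - 1, -1):   # reduce using X^M = -(lower part of h)
        c = prod[k]
        for j in range(M):
            prod[k - M + j] -= c * H[j]
    return tuple(x % 4 for x in prod[:M])

ZERO = (0,) * M
ONE = (1,) + (0,) * (M - 1)
X = (0, 1) + (0,) * (M - 2)

T = [ZERO]                      # Teichmueller set {0, 1, X, ..., X^(2^M - 2)}
power = ONE
for _ in range(2 ** M - 1):
    T.append(power)
    power = mul(power, X)

# 2-adic representation: c -> (a, b) with c = a + 2b and a, b in T.
DECOMP = {add(a, twice(b)): (a, b) for a, b in product(T, T)}
RING = list(DECOMP)

def frobenius(c):
    """Generalized Frobenius map: a + 2b -> a^2 + 2 b^2."""
    a, b = DECOMP[c]
    return add(mul(a, a), twice(mul(b, b)))

fixed = sorted(c for c in RING if frobenius(c) == c)
print(fixed)  # [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]
```

The fixed elements are exactly the constants $0, 1, 2, 3$, that is, the copy of $\mathbb{Z}_4$ inside the ring. Additivity, multiplicativity and the order of $f$ can be checked with the same helpers by looping over all pairs of elements of `RING`.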

Definition A.10.33. The generalized trace is a mapping $\operatorname{tr}_{\mathrm{GR}(4^m)} : \mathrm{GR}(4^m) \to \mathbb{Z}_4$ such
that

$$\operatorname{tr}_{\mathrm{GR}(4^m)}(a + 2b) = \sum_{i=0}^{m-1} \left( a^{2^i} + 2 b^{2^i} \right),$$

where $a + 2b \in \mathrm{GR}(4^m)$ is the 2-adic representation of an element of $\mathrm{GR}(4^m)$. If it is
clear from the context, we will write "tr" instead of "$\operatorname{tr}_{\mathrm{GR}(4^m)}$".
In analogy to the field trace, we also have the nice property that the trace is linear.
Fact A.10.34. The trace is a linear functional from $\mathrm{GR}(4^m)$ to $\mathbb{Z}_4$.
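The generalized trace and its linearity can be checked by enumeration in a small ring. This is a minimal sketch, assuming $m = 3$ and the illustrative polynomial $h(X) = X^3 + 2X^2 + X + 3$ (an assumption, not taken from the text); the helper names are hypothetical. The trace is computed directly from the 2-adic representation as $\sum_{i=0}^{m-1} (a^{2^i} + 2b^{2^i})$.

```python
from itertools import product

# Assumed example: h(X) = X^3 + 2X^2 + X + 3 over Z_4, coefficients low to high.
H = (3, 1, 2, 1)
M = len(H) - 1

def add(u, v):
    return tuple((x + y) % 4 for x, y in zip(u, v))

def twice(u):
    return tuple((2 * x) % 4 for x in u)

def mul(u, v):
    """Multiply in Z_4[X]/(h(X)); elements are coefficient tuples."""
    prod = [0] * (2 * M - 1)
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            prod[i + j] += ui * vj
    for k in range(2 * M - 2, M - 1, -1):   # reduce using X^M = -(lower part of h)
        c = prod[k]
        for j in range(M):
            prod[k - M + j] -= c * H[j]
    return tuple(x % 4 for x in prod[:M])

ZERO = (0,) * M
ONE = (1,) + (0,) * (M - 1)
X = (0, 1) + (0,) * (M - 2)

T = [ZERO]                      # Teichmueller set {0, 1, X, ..., X^(2^M - 2)}
power = ONE
for _ in range(2 ** M - 1):
    T.append(power)
    power = mul(power, X)

# 2-adic representation: c -> (a, b) with c = a + 2b and a, b in T.
DECOMP = {add(a, twice(b)): (a, b) for a, b in product(T, T)}
RING = list(DECOMP)

def trace(c):
    """Generalized trace: sum over i < M of a^(2^i) + 2 b^(2^i), as an int mod 4."""
    a, b = DECOMP[c]
    total = ZERO
    for _ in range(M):
        total = add(total, add(a, twice(b)))
        a, b = mul(a, a), mul(b, b)
    assert all(x == 0 for x in total[1:])   # the result is a constant in Z_4
    return total[0]

print(trace(ONE), trace(X))  # 3 2
```

In this example $\operatorname{tr}(1) = 3 = m \cdot 1$ and $\operatorname{tr}(X) = 2$. Looping over all pairs in `RING` confirms that $\operatorname{tr}(c + d) = \operatorname{tr}(c) + \operatorname{tr}(d)$ modulo 4 for this ring.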